DEVELOPMENT OF A VECTOR TO TARGET GENE EXPRESSION
TO THE EPIDERMIS OF TRANSGENIC ANIMALS

This invention was partially supported by grants from the United 
States government under HD25479, AI30283 and CA52607 awarded by the 
National Institutes of Health. Further, this work was partially performed 
at the National Institutes of Health in the Laboratory of Cellular 
Carcinogenesis and Tumor Promotion, Division of Cancer Etiology,
National Cancer Institute. The government has certain rights in the 
invention. 

FIELD OF THE INVENTION 

The present invention relates to expression vectors for use in 
expressing polypeptides in epidermal cells of transgenic animals. More 
particularly it relates to vectors containing the K1 keratin gene promoter,
its 5' flanking region, its 5' transcribed but untranslated region, its first
intron and intron/exon boundary, its 3' transcribed but untranslated
region, its contiguous non-coding DNA containing the gene's natural 
transcriptional termination region and its 3' flanking region. 

BACKGROUND OF THE INVENTION 

The ability to stably introduce genes into the germline of mice has
greatly enhanced prospects for the generation of animal models of human
diseases (Palmiter and Brinster, Ann. Rev. Genet., Vol. 20, pp. 465-499
(1986)). The need for such animal models is becoming increasingly
apparent as novel pharmaceuticals are developed which are specifically
designed to inhibit expression of human viruses or counteract the effect
of mutated genes that occur in human diseases. Current efficacy
assessments of these new therapeutic agents are restricted to in vitro
models which do not allow evaluation of delivery routes nor assessment
of other factors known to affect disease processes in vivo, such as blood
supply, an intact immune system, humoral and cell-mediated growth
controls and physical barriers to disease progression. In addition, the
prospects for utilizing gene therapy to treat human disorders are coming
closer to reality. Therefore, animal models of human diseases would be
useful to assess the therapeutic potential of these approaches. The
epidermis is an attractive tissue for the development of animal models
since it serves as a general model for other squamous epithelia and its
accessibility allows macroscopic observation of pathological events and
easy assessment of therapeutic potential. The development of a vector
which specifically targets gene expression to the epidermis of transgenic
animals is the subject of this invention.

The epidermis is a continuously regenerating stratified squamous
epithelium. Differentiated epidermal cells are the progeny of proliferative
cells located in the basal cell layer and there is substantial evidence
suggesting that the regeneration process occurs in proliferative units
composed of slowly cycling, self-renewing stem cells, proliferative but
non-renewing transit amplifying cells, and post-mitotic maturing epidermal
cells (Iversen, et al., Cell Tissue Kinet., Vol. 1, pp. 351-367, (1968);
MacKenzie, et al., Nature, Vol. 226, pp. 653-655, (1970); Christophers, et
al., J. Invest. Dermatol., Vol. 56, pp. 165-170, (1971); Potten, In Stem
Cells: Their Identification and Characterization, pp. 200-232, (1983);
Cotsarelis, et al., Cell, Vol. 61, pp. 1329-1337, (1990)). The maturation
process (terminal differentiation) is initiated when epidermal cells
withdraw from the cell cycle and migrate from the basal layer into the
spinous layer. Maturation continues as spinous cells migrate into the
granular layer and terminates with the formation of the stratum corneum.
Morphological and biochemical studies have shown that terminal
differentiation occurs in stages (Matoltsy, J. Invest. Dermatol., Vol. 65,
pp. 127-142, (1975)). Keratins K5 and K14 are major products of basal
epidermal cells (Woodcock-Mitchell, et al., J. Cell Biol., Vol. 95, pp. 580-
588, (1982)). These proteins assemble into 10 nm filaments (intermediate
filaments [IF]) and, together with microtubules (tubulin) and
microfilaments (actin), comprise the cytoskeleton of epidermal cells
(Steinert, P.M., et al., Cell, Vol. 42, pp. 411-419, (1985)). One of the
earliest changes associated with the commitment to differentiation and
migration into the spinous layer is the induction of another
differentiation-specific pair of keratins (K1 and K10). IF containing K1
and K10 replace those containing K5 and K14 as the major products of
cells in the spinous layer (Woodcock-Mitchell, et al., J. Cell Biol., Vol. 95,
pp. 580-588, (1982); Roop, et al., Proc. Natl. Acad. Sci., USA, Vol. 80, pp.
716-720, (1983); Schweizer, et al., Cell, Vol. 37, pp. 159-170, (1984)). The
keratin IF formed by these proteins assemble into bundles. In the
granular layer, another high molecular weight non-IF protein is
synthesized, which is processed into filaggrin, and is thought to promote
keratin filament aggregation and disulfide-bond formation (Dale, B.A., et
al., Nature, Vol. 276, pp. 729-731, (1978); Harding, C.R., et al., J. Mol.
Biol., Vol. 170, pp. 651-673, (1983)). In the final stage of epidermal cell
maturation, transglutaminase catalyzes the crosslinking of involucrin and
loricrin, by the formation of (γ-glutamyl) lysine isopeptides, into a highly
insoluble cornified envelope which is located just beneath the plasma
membrane (Rice and Green, Cell, Vol. 11, pp. 417-422, (1977); Mehrel, et al.,
Cell, Vol. 61, pp. 1103-1112, (1990)).

Genes or cDNAs encoding the major keratins expressed in
epidermal cells have now been cloned: K5 (Lersch, et al., Mol. and Cell
Biol., Vol. 8, pp. 486-493, (1988)), K14 (Marchuk, et al., Proc. Natl. Acad.
Sci., USA, Vol. 82, pp. 1609-1613, (1985); Knapp, et al., J. Biol. Chem., Vol.
262, pp. 938-945, (1987); Roop, et al., Cancer Res., Vol. 48, pp. 3245-3252,
(1988)), K1 (Steinert, et al., J. Biol. Chem., Vol. 260, pp. 7142-7149, (1985))
and K10 (Krieg, et al., J. Biol. Chem., Vol. 260, pp. 5867-5870, (1985)).
Northern blot analysis and in situ hybridization studies suggest that
keratin genes K5 and K14 are predominantly transcribed in the
proliferating basal layer and transcription of keratin genes K1 and K10 is
induced as cells migrate into the spinous layer (Lersch, et al., Mol. and
Cell Biol., Vol. 8, pp. 486-493, (1988); Knapp, et al., J. Biol. Chem., Vol.
262, pp. 938-945, (1987); Roop, et al., Cancer Res., Vol. 48, pp. 3245-3252,
(1988)). Genes encoding rat (Haydock, et al., J. Biol. Chem., Vol. 261, pp.
12520-12525, (1986)) and mouse (Rothnagel, et al., J. Biol. Chem., Vol.
262, pp. 15643-15648, (1987)) filaggrin have now been identified and in
situ hybridization experiments have confirmed that transcription of this
gene is restricted to the granular layer (Rothnagel, et al., J. Biol. Chem.,
Vol. 262, pp. 15643-15648, (1987); Fisher, et al., J. Invest. Dermatol., Vol.
88, pp. 661-664, (1987)). To date, loricrin is the only gene encoding a
component of the cornified envelope to be studied at the molecular level
by in situ hybridization and transcripts of this gene are restricted to the
granular layer (Mehrel, et al., Cell, Vol. 61, pp. 1103-1112, (1990)).

From this description of gene expression in the epidermis, there
would appear to be many candidate genes from which to choose for
targeting to the epidermis. However, this is not the case. Keratins K5
and K14, expressed in the proliferative compartment of the epidermis, are
not only expressed in the epidermis but in all squamous epithelia.
Furthermore, these genes are expressed early in development (Dale and
Holbrook, In: Current Topics in Developmental Biology, pp. 127-151,
(1987)) and this could cause lethality in utero. The generation of animal
models of hyperproliferative diseases such as cancer and psoriasis would
most likely require expression in the basal compartment in cells with
proliferative potential; therefore, genes expressed post-mitotically such as
keratins K1 and K10 and those encoding filaggrin and components of the
cell envelope (involucrin and loricrin) would be excluded on this basis.
The present invention, however, demonstrates that a 12 kb fragment of
the human keratin gene K1 (HK1) contains sequences regulating tissue
and developmental specific expression in transgenic mice. This fragment
lacks sequences responsive to negative control of differentiation specific
expression, resulting in expression of the HK1 gene in some cells of the
basal cell compartment of the epidermis. Although regulatory elements
of the HK1 gene fail to completely mimic the expression pattern of the
endogenous mouse K1 gene, they are ideally suited for targeting gene
expression for the following reasons: (1) expression only occurs in the
epidermis and not other squamous epithelia; (2) expression occurs at a
late stage of development (day 15) and, therefore, is unlikely to result in
lethality in utero; (3) expression occurs in a large proportion of basal cells
that have proliferative potential.

SUMMARY OF THE INVENTION 
An object of the present invention is a keratin K1 vector for
expressing nucleic acid sequences in the epidermis.

An additional object of the present invention is a keratin K1 vector
containing an oncogene.

A further object of the present invention is a bioreactor for
producing proteins, polypeptides and antisense RNA in transduced
epidermal cells.

An additional object of the present invention is an in vivo method
of transducing epidermal cells with a keratin K1 vector.

A further object of the present invention is provision of a transgenic
animal containing the keratin K1 epidermal vector.

An additional object of the present invention is the provision of a
transgenic animal for the study of cancer.

Another object of the present invention is a method of treating skin
ulcers.

A further object of the present invention is an enhanced method of
wound healing or healing of surgical incisions.

An additional object of the present invention is a method of treating
psoriasis.

An additional object of the present invention is a method of treating
skin cancer.

An additional object of the present invention is a vaccination
procedure using the keratin K1 epidermal vector.

Thus, in accomplishing the foregoing objects, there is provided in
accordance with one aspect of the present invention a keratin K1 vector
for expression of a nucleic acid cassette in the epidermis comprising: a 5'
flanking region of the keratin K1 gene, said 5' flanking region
including a keratin K1 promoter, a 5' transcribed but untranslated region
and a first intron and an intron/exon boundary all in sequential and
positional relationship for expression of a nucleic acid cassette; a 3'
flanking region of the keratin K1 gene containing regulatory sequences,
said 3' flanking region including a 3' transcribed but untranslated region
and contiguous noncoding DNA containing a transcriptional termination
region; and a polylinker having a plurality of restriction endonuclease
sites, said polylinker connecting the 5' flanking region to the 3' flanking
region and said polylinker further providing a position for insertion of the
nucleic acid cassette.

In specific embodiments of the present invention the keratin K1
vector has a 5' flanking region of approximately 1.2 kb and a 3' flanking
region of approximately 3.9 kb.

In alternative embodiments of the present invention there is a
further addition of approximately 8.0 kb of 5' flanking sequence from the
18 kb Eco RV fragment onto the 5' end of the vector.

In another alternative embodiment the Vitamin D3 regulatory
element within the human K1 keratin gene is identified and utilized to
suppress expression of the keratin K1 vector.

In the specific embodiments of the present invention the keratin K1
vector is used to transduce epidermal cells to form bioreactors. The 
bioreactors produce a variety of proteins, polypeptides and RNAs. 
Additionally, the vector can be used to form transgenic animals. The 
transgenic animals can be used to study cancer, drug reactions and 
treatments. 

The keratin K1 vector can also be used for the treatment of a
variety of diseases, including wounds, surgical incisions, psoriasis, skin 
ulcers and skin cancer and can be used for the production of vaccines. 

Other and further objects, features and advantages will be apparent 
from the following description of the presently preferred embodiments of 
the invention which are given for the purposes of disclosure when taken 
in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a schematic drawing of the human keratin K1 gene (HK1) and
the expression vector derived from its regulatory sequences.
Figure 2 shows the expression characteristics of the HK1 vector in vivo in
transgenic mice utilizing a reporter gene encoding E. coli β-galactosidase.
Figure 3 demonstrates the suppression of the SV40 promoter by a novel
negative regulatory element from the HK1 gene (HK1.NRE) in the
presence of Vitamin D3.

Figure 4 is a schematic drawing of the HK1 vector containing the coding
sequence of the v-rasHa protein of Harvey Murine Sarcoma Virus.



Figure 5 is a schematic drawing of the HK1 vector containing the coding
sequence of the v-fos protein from a FBJ/FBR chimeric plasmid.
Figure 6 is a schematic drawing of the HK1 expression vector containing
the coding sequences of the E6 and E7 proteins from human papilloma
virus 18.

Figure 7 is a schematic drawing of the HK1 vector containing the coding
sequence of TGF-α.

Figure 8 is a schematic drawing of the HK1 vector containing the coding
sequence of the trans-regulatory protein tat, from human
immunodeficiency virus.

Figure 9 is a schematic drawing of an 18 kb Eco RV fragment containing
the HK1 gene.

Figure 10 is a schematic drawing of a derivative of the HK1 vector
containing additional 5' flanking sequences which restrict expression to
differentiated epidermal cells.

The drawings are not necessarily to scale, and certain features of 
the invention may be exaggerated in scale and shown in schematic form
in the interest of clarity and conciseness. 

DETAILED DESCRIPTION OF THE INVENTION 
It will be readily apparent to one skilled in the art that varying 
substitutions and modifications may be made to the invention disclosed 
herein without departing from the scope and spirit of the invention. 

The term "transformed" as used herein refers to the process or 
mechanism of inducing changes in the characteristics (expressed 
phenotype) of a cell by the mechanism of gene transfer whereby DNA is 
introduced into a cell in a form where it expresses a specific gene product
or alters expression of endogenous gene products. 

The term "transduction" as used herein refers to the process of
introducing a DNA expression vector into a cell. Various methods of
transduction are possible, including microinjection, CaPO4 precipitation,
lipofection (liposome fusion), use of a gene gun and DNA vector
transporter. The human keratin K1 vector can be transduced into the
epidermal cells by any of the variety of ways described above.

The term "DNA vector transporter" as used herein refers to those 
molecules which bind to DNA vectors and are capable of being taken up 
by epidermal cells. A DNA transporter is a molecular complex capable of
non-covalent binding to DNA and efficiently transporting the DNA
through the cell membrane. Although not necessary, it is preferable that
the transporter also transport the DNA through the nuclear membrane.

The term "transient" as used in transient transfection, transient
transduction or transiently transformed relates to the introduction of
genes into the epidermal cells to express specific proteins, polypeptides
and RNA wherein the introduced genes are not integrated into the host
cell genome and accordingly are eliminated from the cell over a period of
time. Transient expression relates to the expression of gene products
during the period of transient transfection. Additionally, transient can
refer to a stable transfection or transduction into cells, where the cells die
and are sloughed off from the skin. Thus, the transformed cells are only
transiently available for the expression of the incorporated genes.

The term "stable" as used in stable transfection, stable transduction or
stably transformed refers to the introduction of genes into the
chromosome of the targeted cell, where they integrate and become a
permanent component of the genetic material in that cell. Gene
expression after stable transduction can permanently alter the
characteristics of the cell, leading to stable transformation. An episomal
transformation is a variant of stable transformation in which the
introduced gene is not incorporated into the host cell chromosomes but
rather is replicated as an extrachromosomal element. This can lead to
apparently stable transformation of the characteristics of the cell. As
indicated above, in epidermal cells, which are sloughed from the body
through the skin, stable transformation can become transient
transformation because the cells are lost.

The term "nucleic acid cassette" as used herein refers to the genetic
material of interest which can express a protein, polypeptide or RNA and
which is capable of being incorporated into the epidermal cells. The
nucleic acid cassette is positionally and sequentially oriented within the
keratin K1 vector such that the nucleic acid in the cassette can be
transcribed into RNA or antisense RNA and, when necessary, translated
into proteins or polypeptides in the transformed epidermal cells. A variety
of proteins and polypeptides can be expressed in the transformed
epidermal cells by the sequence in the nucleic acid cassette. These
proteins or polypeptides which can be expressed include hormones, growth
factors, enzymes, clotting factors, apolipoproteins, receptors, drugs, tumor
antigens, viral antigens, parasitic antigens, bacterial antigens and
oncogenes. Specific examples of these compounds include proinsulin,
insulin, growth hormone, insulin-like growth factor I, insulin-like growth
factor II, insulin growth factor binding protein, epidermal growth factor
TGF-α, dermal growth factor PDGF, angiogenesis factors, for instance, acid
fibroblast and basic fibroblast growth factors and angiogenin, matrix
proteins, such as Type IV collagen, Type VII collagen, laminin and nidogen,
and proteins from viral, bacterial and parasitic organisms which can be
used to induce an immunologic response.

In addition, the nucleic acid cassette can encode a "transforming 
gene" which encompasses viral oncogenes, endogenous proto-oncogenes 
and activated proto-oncogenes. A variety of oncogenes are known in the 
art. The term "oncogene" means those genes which cause cancer and
includes both viral and cellular oncogenes, many of which are homologous
to DNA sequences endogenous to rodents and/or humans. The term
oncogene includes both the viral sequence and the homologous endogenous
sequences. Some examples of transforming genes are listed in Table 1. 



Table 1.
Transforming Genes

ABBREVIATION    NAME

Ha-ras          Harvey Murine Sarcoma Virus
Ki-ras          Kirsten Murine Sarcoma Virus
N-ras           Neuroblastoma oncogene
fos             FBJ or FBR osteosarcoma virus
myc             Avian MC29 myelocytomatosis virus
src             Rous sarcoma virus
sis             Simian sarcoma virus/PDGF β chain
erbA            Avian erythroblastosis virus/Thyroxine T3 receptor
erbB            Avian erythroblastosis virus/Truncated EGF receptor
jun             Avian sarcoma virus 17
p Large T       Polyomavirus transforming gene
p Middle T      Polyomavirus transforming gene
HPVE7           Early region transforming gene from human
                papilloma virus 6, 11, 16, 18
HPVE6           Early region transforming gene from human
                papilloma virus 6, 11, 16, 18
HPVE5           Early region transforming gene from human
                papilloma virus 6, 11, 16, 18
tat             HIV transforming gene
E1A             Adenovirus early region 1A
Rb              Mutated retinoblastoma gene
p53             Mutated p53 anti-oncogene
WT1             Mutated Wilms tumor anti-oncogene
TGF-α           Transforming growth factor α
TGF-β           Transforming growth factor β
EGFR            Mutated epidermal growth factor receptor
RAR             Mutated retinoic acid receptor
VD3R            Mutated vitamin D3 receptor
PKC             Mutated protein kinase C



The genetic material which is incorporated into the epidermal cells
using the keratin K1 vector includes DNA not normally found in
epidermal cells, DNA which is normally found in epidermal cells but not
expressed at physiologically significant levels, DNA normally found in
epidermal cells and normally expressed at physiologically desired levels, any
other DNA which can be modified for expression in epidermal cells, and
any combination of the above.

The term "keratin K1 vector" or "HK1 vector" as used herein is a
vector which is useful for expression of a nucleic acid sequence in
epidermal cells. The keratin K1 vector comprises a 5' flanking region of
the keratin K1 gene, said flanking region including a promoter, a first
intron and an intron/exon boundary all in sequential and positional
relationship for the expression of a nucleic acid cassette; a 3' flanking
sequence of a keratin K1 gene; and a polylinker. The polylinker includes
a plurality of restriction endonuclease sites. The polylinker connects the
5' flanking region to the 3' flanking sequence and further provides a
position for insertion of the nucleic acid cassette.

The sequence for the 3' flanking region of the human keratin K1
gene contains regulatory elements and is used for preparing the keratin K1
vector. It is shown in SEQ. ID. No. 1. The keratin K1 vector has a 5'
flanking region comprising nucleotides 1 to 1246 of SEQ. ID. No. 1; a 3'
flanking sequence containing regulatory sequences comprising nucleotides
6891 to 10747 of SEQ. ID. No. 1; and a polylinker comprising nucleotides
2351 to 2376 of SEQ. ID. No. 2 (the HK1 expression vector).

The keratin K1 vector has a 5' flanking region of approximately
1.2 kb, an intron and intron/exon boundary of approximately 1.0 kb and
a 3' flanking sequence of approximately 3.9 kb.
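
As a quick consistency check, the nucleotide coordinates quoted for the
flanking regions and polylinker can be converted to lengths and compared
with the approximate sizes given in the text. The sketch below assumes the
coordinates are 1-based and inclusive, which the source does not state
explicitly; the names REGIONS, region_length and as_kb are illustrative
only.

```python
# Coordinates as quoted in the text. The 1-based, inclusive convention is
# an assumption; the source does not spell out its numbering convention.
REGIONS = {
    "5' flanking region (SEQ. ID. No. 1)": (1, 1246),
    "3' flanking sequence (SEQ. ID. No. 1)": (6891, 10747),
    "polylinker (SEQ. ID. No. 2)": (2351, 2376),
}


def region_length(start, end):
    """Length in nucleotides of a 1-based, inclusive interval."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def as_kb(n_nucleotides):
    """Length in kilobases, rounded to one decimal place."""
    return round(n_nucleotides / 1000, 1)


if __name__ == "__main__":
    for name, (start, end) in REGIONS.items():
        n = region_length(start, end)
        print(f"{name}: {n} nt (about {as_kb(n)} kb)")
```

Under that assumption the 5' flanking region is 1246 nucleotides and the
3' flanking sequence is 3857 nucleotides, in line with the approximate
1.2 kb and 3.9 kb figures.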

The restriction endonuclease sites found in the linker and polylinker
of the keratin K1 vector can be any restriction endonuclease sites which
will allow insertion of the nucleic acid cassette. In the preferred
embodiment they are usually selected from the group consisting of
Bam HI, Kpn I, Cla I, Not I, Xma I, and Bgl II.
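
When choosing polylinker sites for a given cassette, it is useful to confirm
that each enzyme cuts the construct only once. The following is a minimal
sketch of such a check using the standard recognition sequences of the six
enzymes named above. The function names and the toy sequence are
hypothetical and are not taken from the sequence listings; because all six
sites are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences (5' to 3') for the enzymes named in the text.
RECOGNITION_SITES = {
    "Bam HI": "GGATCC",
    "Kpn I": "GGTACC",
    "Cla I": "ATCGAT",
    "Not I": "GCGGCCGC",
    "Xma I": "CCCGGG",
    "Bgl II": "AGATCT",
}


def count_sites(sequence, site):
    """Count occurrences of a recognition site, allowing overlaps."""
    seq = sequence.upper()
    count = 0
    pos = seq.find(site)
    while pos != -1:
        count += 1
        pos = seq.find(site, pos + 1)
    return count


def unique_cutters(sequence):
    """Names of enzymes whose site occurs exactly once in the sequence."""
    return [
        name
        for name, site in RECOGNITION_SITES.items()
        if count_sites(sequence, site) == 1
    ]


if __name__ == "__main__":
    # Toy sequence for illustration only: two Bam HI sites, one Kpn I site
    # and one Cla I site.
    toy = "AAGGATCCTTGGTACCAAATCGATTTGGATCCAA"
    print(unique_cutters(toy))
```

An enzyme that cuts the insert or vector more than once, like Bam HI in
the toy sequence, would not be suitable for directional cloning of that
particular cassette.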

One skilled in the art will readily recognize that there are a variety
of ways to introduce the keratin K1 vector into epidermal cells. The
vectors can be inserted either in vivo or ex vivo. The mode of insertion
will, to a certain degree, determine the available methods for the insertion.
The in vivo insertion is preferred for gene therapy. In this procedure the
human keratin K1 vector is contacted with epidermal cells for sufficient
time to transform the epidermal cells.

One embodiment of the present invention includes a bioreactor. A
bioreactor is comprised of transformed epidermal cells which contain the
keratin K1 vector. Once the vector is inserted in the epidermal cells, the
epidermal cells will express the nucleic acid cassette and produce the
protein, polypeptide or antisense RNA of interest. This can be done either
in vivo or ex vivo. Any compound which can be encoded in, and expressed
by, the nucleic acid cassette can be produced by the bioreactor.

One method for ex vivo introduction of the keratin K1 vector into
epidermal cells includes a cotransfection of the vector with a selectable 
marker. The selectable marker is used to select those cells which have 
become transformed. The cells can then be used in any of the methods 
described in the present invention. 

Another embodiment of the present invention is a method of
making transgenic animals comprising the step of inserting the human
keratin K1 vector into the embryo of the animal. The transgenic animal
can include the resulting animal in which the vector has been inserted
into the embryo or any progeny. The term progeny as used herein
includes direct progeny of the transgenic animal as well as any progeny
of succeeding progeny. Thus, one skilled in the art will readily recognize
that if two different transgenic animals have been made using different
genes in the nucleic acid cassette and they are mated, the possibility exists
that some of the resulting progeny will contain two or more introduced
sequences. One skilled in the art will readily recognize that by controlling
the matings, transgenic animals with multiple vectors can be made.
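
To illustrate why only some progeny of such a mating carry both introduced
sequences, the sketch below enumerates the gametes of a simple cross. It
rests on assumptions that are not stated in the source: each parent is
hemizygous for a single transgene insertion (labeled A and B here purely
for illustration), the two insertions are unlinked, and segregation is
Mendelian.

```python
from fractions import Fraction
from itertools import product


def offspring_genotypes():
    """Enumerate equally likely offspring of an A-hemizygous x B-hemizygous cross.

    Assumption (not from the source): each parent carries one copy of its
    transgene at a single locus, so half of its gametes carry the transgene.
    """
    parent_a_gametes = [frozenset({"A"}), frozenset()]
    parent_b_gametes = [frozenset({"B"}), frozenset()]
    return [ga | gb for ga, gb in product(parent_a_gametes, parent_b_gametes)]


def fraction_with_both():
    """Expected fraction of offspring carrying both transgenes."""
    offspring = offspring_genotypes()
    both = sum(1 for genotype in offspring if {"A", "B"} <= genotype)
    return Fraction(both, len(offspring))


if __name__ == "__main__":
    print(fraction_with_both())
```

Under these assumptions one quarter of the progeny would be expected to
carry both vectors; actual ratios depend on copy number, insertion site
and viability, so progeny would still need to be genotyped.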

In the transgenic animals that contain the human keratin K1 vector
in their germ and somatic cells, the nucleic acid cassette of the said vector
is only expressed in the epidermal cells. This is a distinct advantage over
other transgenic animal models where there is not as much control over
the expression of the sequence in the tissues.

In the preferred embodiment, the transgenic animal will contain an
oncogene sequence in the nucleic acid cassette. Preferably the animal is
a rodent. The transgenic animal can be used in any method for
studying a variety of diseases including the origin of cancer, the treatment
of cancer, interaction of the cancer with the environment as well as for
looking at drugs, pharmaceuticals and other chemical interactions. The
transgenic animals are useful in any assay in which the skin cells of the
animal can be used.

One specific embodiment of the present invention is a method for 
the enhanced healing of a wound or surgical incision. This method 
comprises the in vivo transduction of epidermal cells with a keratin K1
vector. The nucleic acid cassette of said vector contains a nucleic acid 
sequence for a growth factor. 

In the preferred embodiment for the treatment of wounds or
surgical incisions, a plurality of vectors are introduced into the epidermal
cells. In the plurality of vectors, the cassette of at least one vector
contains a nucleic acid sequence for an epidermal growth factor (TGF-α),
the cassette of at least one vector contains a nucleic acid sequence for a
dermal growth factor (PDGF), a cassette of at least one vector contains a
nucleic acid sequence for a matrix protein to anchor the epidermis to the
dermis, and a cassette of at least one vector contains a nucleic acid
sequence for an angiogenesis factor. The sequence for matrix proteins can
be selected from any sequences useful for the anchoring of the epidermis
to the dermis but is usually selected from the group consisting of Type IV
collagen, laminin, nidogen, and Type VII collagen. The angiogenesis factor
is usually selected from the group consisting of acid fibroblast and basic
fibroblast growth factors, and angiogenin. The combination of the vectors
provides all of the necessary elements for rapid enhancement of healing
of wounds or surgical incisions. This procedure is very helpful in the case
of plastic or reconstructive surgery. Furthermore, skin ulcers can be
treated by following similar procedures as described for wound healing or
surgical incision. These procedures for healing of wounds, surgical
incisions and skin ulcers are useful in animals and humans.

In the ex vivo approach for treating or healing wounds, surgical 
incisions and skin lesions, the vectors are first transduced into the 
epidermal cells ex vivo. The transformed epidermal cells are transplanted 
onto the animal or human to be treated.

Another embodiment of the present invention is a method for
treating psoriasis. In this method, epidermal cells are transduced in
vivo with a keratin K1 vector. A nucleic acid cassette in said vector
contains a nucleic acid sequence for a protein or polypeptide selected from
the group consisting of TGF-β, a soluble form of cytokine receptor, and an
antisense RNA. The cytokine receptor can be selected from the group
consisting of IL-1, IL-6, and IL-8. The antisense RNA sequence is selected
from the group consisting of TGF-α, IL-1, IL-6, and IL-8.

In another embodiment of the present invention there is a method
of treating skin cancer. This method comprises the step of in vivo
transduction of epidermal cells with a keratin K1 vector. The nucleic acid
cassette of the vector contains the nucleic acid sequence coding for
antisense RNA for the E6 or E7 genes of the human papilloma virus or
coding for the normal p53 protein.

It has been found that the keratin K1 vector contains a novel
negative regulatory element in its 3' flanking sequence which can be
suppressed by Vitamin D3. With the Vitamin D3 regulatory element in the
vector, the expression of a nucleic acid cassette can be regulated by
Vitamin D3, a commonly used substance in animals and humans.

The human keratin K1 vector can also be modified by the insertion
of additional 5' flanking sequences from an 18 kb Eco RV fragment to its
5' end (nucleotides 6090 to 14180 of SEQ. ID. No. 3). The addition of
these sequences allows the human keratin K1 vector to be expressed
exactly like the endogenous K1 gene, that is, post-mitotically in cells
committed to terminal differentiation. Since these cells are programmed
to die and will eventually slough into the environment, this is another way
of producing transient expression in cells.

An additional embodiment of the present invention is a method for
vaccination comprising the step of in vivo introduction of a keratin K1
vector into epidermal cells. The nucleic acid cassette in the vector
usually codes for a polypeptide which induces an immunological response.
An example of this is the viral capsid from the human papilloma virus.
One skilled in the art will readily recognize that any other variety of
proteins can be used to generate an immunologic response and thus produce
antibodies for vaccination.

The following examples are offered by way of illustration and are 
not intended to limit the invention in any manner. 

EXAMPLE 1 

Construction and Characterization of a Vector From the HK1 Gene
To target the expression of exogenous DNA to the epidermis a
vector from the human keratin K1 gene was constructed. Among its many
uses, it is useful in making transgenic animals.

A schematic showing the structure of the human keratin K1 gene
is shown in Figure 1. The 12 kb EcoRI fragment containing the entire
human keratin K1 gene was originally isolated from lambda clone c55
(Johnson, et al., PNAS, USA, Vol. 82, pp. 1896-1900 (1985)). In
constructing the targeting vector, most of the first exon including the ATG
was removed, leaving only the 5' non-coding sequences, the first intron
and the intron-exon boundaries. In addition, the remainder of the gene
up to the termination codon was deleted. A polylinker containing the
following unique restriction sites (Bam HI, Xma I, Kpn I, Not I, and Cla
I) was engineered into a site 3' of the first intron to allow easy insertion
of exogenous DNA. These manipulations were performed through the use
of polymerase chain reactions (PCR). The unique EcoRI sites were
conserved at the ends of the vector to allow easy amplification in pGEM
vectors and excision for purification from plasmid sequences prior to
injection into embryos.
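
The requirement that each polylinker site be unique in the vector can be checked computationally. The following is a minimal sketch, not the procedure used in this example: it counts occurrences of the standard recognition sequences for Bam HI, Xma I, Kpn I, Not I and Cla I in a candidate sequence. The function names and the toy sequence are hypothetical and introduced only for illustration; the actual vector sequence would have to be substituted. Because all five sites are palindromic, scanning one strand is sufficient.

```python
# Recognition sequences for the five polylinker enzymes named in the text.
POLYLINKER_SITES = {
    "BamHI": "GGATCC",
    "XmaI": "CCCGGG",
    "KpnI": "GGTACC",
    "NotI": "GCGGCCGC",
    "ClaI": "ATCGAT",
}


def count_sites(sequence, site):
    """Count occurrences of a recognition site, including overlapping ones."""
    sequence = sequence.upper()
    count = 0
    start = sequence.find(site)
    while start != -1:
        count += 1
        start = sequence.find(site, start + 1)
    return count


def non_unique_sites(sequence):
    """Return {enzyme: count} for every polylinker site not present exactly once."""
    counts = {
        name: count_sites(sequence, site)
        for name, site in POLYLINKER_SITES.items()
    }
    return {name: n for name, n in counts.items() if n != 1}


# Hypothetical toy sequence (an assumption, not the real vector):
# the five sites joined end to end with short flanks.
TOY_VECTOR = "AAAA" + "".join(POLYLINKER_SITES.values()) + "TTTT"

if __name__ == "__main__":
    # Every site occurs once in the toy sequence.
    print(non_unique_sites(TOY_VECTOR))
    # Appending a second Bam HI site makes Bam HI non-unique.
    print(non_unique_sites(TOY_VECTOR + "GGATCC"))
```

An empty result means every polylinker site occurs exactly once, which is the property that allows an insert to be cloned directionally between two chosen sites.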

The rationale for constructing the vector in this manner was as
follows. Since the specific elements responsible for the expression
characteristics of the 12 kb human keratin K1 fragment have not been
defined, the entire 5' and 3' flanking regions were included in the vector
construct. One skilled in the art will readily recognize that as these
elements are further defined the flanking sequences can be changed
accordingly. In addition, sequences within the 3' non-coding region were
retained since these may confer stability to transcripts of exogenous
DNA in epidermal cells. The first intron was retained to potentially
enhance expression efficiency (Brinster, et al., PNAS, USA, Vol. 82, pp.
1896-1900 (1988)).

EXAMPLE 2 

HK1 Expression in Epidermal Keratinocytes
To assess the human keratin K1 targeting vector for exclusive
expression in epidermal keratinocytes, the β-galactosidase reporter gene
was cloned into Bam HI and Cla I restriction sites located in the
polylinker region of the expression vector (Figure 1). The β-galactosidase
gene has frequently been used as a reporter gene to assess targeting
specificity (MacGregor et al., In: Methods in Molecular Biology, Vol. 7, pp.
217-235 (1991)). This construct was designated pHK1.β-gal. To determine
if expression of this construct resulted in the production of a functional
protein, and to determine whether the vector retained cell type specificity,
this construct was transfected into primary epidermal keratinocytes and
primary dermal fibroblasts. At seventy-two hours post-transfection, cells
were stained with a solution containing the substrate 5-bromo-4-chloro-3-
indolyl-β-galactoside (X-gal). β-galactosidase activity, indicated by a blue
coloration, was detected in keratinocytes but not fibroblasts. Thus,
expression of the HK1.β-gal construct was cell type specific and resulted
in the production of a functional protein.

EXAMPLE 3
Transgenic Mice

The same pHK1.β-gal construct utilized in the in vitro studies
discussed in Example 2 was used in the production of transgenic mice.
This construct was digested with EcoRI (see Figure 1) and subjected to
preparative agarose gel electrophoresis to purify the pHK1.β-gal
expression construct away from plasmid sequences (pGEM 3) which might
interfere with expression. The separated expression construct sequences
were purified and recovered using NA 45 DEAE membrane (Schleicher &
Schuell). DNA was precipitated and resuspended at 1-3 ng/µl. ICR
outbred female mice (Sasco) were given PMS and HCG to stimulate
superovulation, mated to FVB males (Taconic) and the resulting early
fertilized embryos (most preferably one-cell stage) were collected from the
oviducts. DNA was microinjected into the pronuclei and the embryos
were surgically transferred to pseudopregnant recipient females (the result
of mating ICR females with vasectomized B6D2F1 males (Taconic)).

In the initial experiments, 40 mice were born. In order to quickly
determine if the pHK1.β-gal transgene was being exclusively expressed in
the epidermis of these mice, these animals were sacrificed at birth. A
small amount of tissue was removed for extraction of DNA and the
remainder of the neonate was rapidly frozen in Tissue-Tek O.C.T. for
frozen sections. PCR analysis was performed on the extracted DNA using
oligonucleotide primers specific for the intron within the HK1 vector and
this demonstrated that 5 of the 40 neonates contained the HK1.β-gal
construct.

To assess whether expression of the HK1.β-gal construct was
restricted to the epidermis or expressed in other squamous epithelia,
frozen longitudinal sections were cut from several PCR positive and PCR
negative embedded neonates and these were stained with X-Gal. Typical
results are shown in Figure 2 where PCR positive animal, #30, expressed
high levels of β-galactosidase in the epidermis (Figure 2A) and a PCR
negative sibling, #29, was completely negative (Figure 2B), indicating that
endogenous murine β-galactosidase was not expressed at sufficient levels
in the epidermis to cause false positives in this assay. Staining of the
intestine was observed in both the positive (#30) and negative (#29)
neonates. This may represent endogenous enzyme activity or the
production of β-galactosidase by bacteria in the intestine. X-gal staining
was detected in the basal compartment, although it was not as intense as in
the differentiated layers (Figure 2D). Thus, the human keratin K1
expression vector is also expressed in a substantial number of proliferating
basal cells.

The most important finding from these initial transgenic
experiments is that the vector constructed from the human keratin K1
gene can target the expression of an exogenous coding sequence
exclusively to the epidermis of transgenic mice. This specificity of
targeting can be readily seen in Figure 2A. This low power exposure of
the skin of #30 demonstrates intense staining with X-Gal. In addition,
there are numerous hair follicles and sebaceous glands in this section
which are marked by arrows and these do not stain with X-Gal. Keratins
K5 and K14 are not only expressed in the epidermis, but in all squamous
epithelia, including hair follicles and sebaceous glands. The expression
pattern for keratin K14 (Figure 2C) is revealed by immunofluorescence
with a specific K14 antiserum of an area from a consecutive section that
is comparable to that in Figure 2A. Note staining of the epidermis, as
well as hair follicles and sebaceous glands. If the strategy used in
constructing the human keratin K1 expression vector had altered its
targeting specificity in transgenic mice, then X-Gal staining would have
been observed in hair follicles, sebaceous glands, other squamous epithelia,
and perhaps even other tissue types. However, expression of the HK1.β-gal
transgene, like the keratin K1 gene itself, is restricted to the epidermis.

EXAMPLE 4 

Regulation of Keratin K1 Vector by Vitamin D3
A novel Vitamin D3 responsive element was used to modulate
expression levels in the epidermis. Although all of the regulatory elements
of the human keratin K1 gene have not been identified, a novel negative
regulatory element from the human keratin K1 gene (HK1.NRE) has been
identified and this example demonstrates that it is able to suppress a
heterologous promoter in response to Vitamin D3. The HK1.NRE is 70
nucleotides in length (nucleotides 9134 to 9204 of SEQ. ID. No. 1). PCR
technology was used to generate Bam HI and Bgl II sites at opposite ends
of this fragment. This facilitates generating multiple copies of this
fragment since ligation and digestion with Bam HI and Bgl II will select
for oligomers which have ligated head to tail. Four tandem copies of the
HK1.NRE were inserted into the Bgl II cloning site of pA10.CAT. In the
absence of Vitamin D3, this construct is highly expressed when transfected
into primary mouse epidermal cells (Figure 3). The addition of increasing
concentrations of Vitamin D3 to the culture medium completely suppresses
transcription of this heterologous promoter. This observation indicates
that the activity of the human keratin K1 expression vector can be
modulated in the epidermis. The activity of the human keratin K1 vector
is suppressed in the epidermis by topical application of Vitamin D3, or an
analogue, to the skin.
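
The head-to-tail selection described in this example follows from the fact that Bam HI (G^GATCC) and Bgl II (A^GATCT) leave the same GATC overhang, so either end can ligate to either end, but only like-to-like junctions regenerate a cleavable site. The following is a minimal sketch of that logic, assuming each monomer carries a Bam HI end on its left and a Bgl II end on its right when in the "+" orientation. The function names and the "+"/"-" orientation encoding are hypothetical and are not part of the original work.

```python
from itertools import product

# Recognition sequences; both enzymes cut after the first base,
# leaving an identical 5'-GATC overhang.
SITES = {"BamHI": "GGATCC", "BglII": "AGATCT"}


def ends(orientation):
    """Return (left_end, right_end) enzyme names for one monomer."""
    if orientation == "+":
        return ("BamHI", "BglII")
    return ("BglII", "BamHI")


def junction(upstream, downstream):
    """Six-base sequence formed where two monomers are ligated together."""
    right_end = ends(upstream)[1]
    left_end = ends(downstream)[0]
    # The upstream end keeps the first base of its site and the downstream
    # end keeps the last base of its site; the shared GATC lies between.
    return SITES[right_end][0] + "GATC" + SITES[left_end][5]


def survives_digest(orientations):
    """True if no junction in the oligomer is cut by Bam HI or Bgl II."""
    for up, down in zip(orientations, orientations[1:]):
        if junction(up, down) in SITES.values():
            return False
    return True


def surviving_oligomers(copies):
    """List the orientation patterns that resist a Bam HI + Bgl II digest."""
    return [
        "".join(pattern)
        for pattern in product("+-", repeat=copies)
        if survives_digest(pattern)
    ]


if __name__ == "__main__":
    print(junction("+", "+"))      # hybrid junction, cut by neither enzyme
    print(junction("+", "-"))      # Bgl II site regenerated
    print(junction("-", "+"))      # Bam HI site regenerated
    print(surviving_oligomers(4))  # only uniformly oriented tetramers remain
```

Under these assumptions, of the sixteen possible orientation patterns for four copies, only the two uniformly oriented (head-to-tail) arrangements survive the double digest, which is the selection the text relies on.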

EXAMPLE 5

Development of Transgenic Animal Models for Skin Carcinogenesis
The ability to stably introduce genes into the germline of mice has
greatly enhanced prospects for generation of animal models of human
disease (Leder and Stewart, U.S. Patent No. 4,736,866 issued April 12,
1988 and Palmiter and Brinster, Ann. Rev. Genet., Vol. 20, pp. 465-499
(1986)). When such genes are combined with regulatory sequences that
target their expression to specific tissues, it provides a model to not only
study diseases in the context of living organisms, but also in specific tissues
suspected of being the targets of these genes. Thus, transgenic mice offer
the possibility to determine the influence of factors such as blood supply,
an intact immune system, humoral and cell-mediated growth controls and
physical barriers on disease progression. The epidermis is an attractive
tissue for targeted gene expression; not only is it a model for epithelial
diseases in general but the accessibility of the epidermis allows easy
detection of progressive pathological changes that result from transgene
expression as well as the assessment of the potential role played by
environmental factors in these processes. In addition, the prospects for
utilizing gene therapy to treat cancer are coming closer to reality.
Therefore, animal models of human cancers would be useful to assess the
therapeutic potential of these approaches. The development of animal
models of skin disease is dependent upon the ability to specifically target
gene expression to the epidermis. The human keratin K1 targeting vector
described in Example 1 is ideally suited for this purpose.

EXAMPLE 6

Targeting the v-ras^Ha Oncogene to the Epidermis
One family of proto-oncogenes, the ras family (ras^Ha, ras^Ki,
ras^N) has been identified in approximately 20% of human tumors by virtue
of specific point mutations at codons 12, 13, and 61 which activate their
transforming potential. The mechanisms whereby ras genes become
activated are currently unknown but there is widespread evidence that
environmental agents play pivotal roles in the etiology of ras mutations.
To date few studies have undertaken to study ras activation in human
skin malignancies. However, recent reports have identified ras^Ha activation
in basal and squamous cell carcinomas appearing on sun exposed body
sites, interestingly at potential pyrimidine dimer sites possibly derived
from skin exposure to UV irradiation. In the mouse skin model of
chemical carcinogenesis where the three distinct stages of initiation,
promotion and malignant conversion have been defined, ras^Ha activation
has been found in benign squamous papillomas, the end point of initiation
and promotion, suggesting an early role for ras^Ha in skin carcinogenesis.
Taken collectively, the above experimental evidence suggests the
importance of developing an animal model to further study the mechanism
of ras^Ha-induced skin carcinogenesis. Toward this end, the sequence
encoding the v-ras^Ha protein of Harvey Murine Sarcoma Virus (Dhar, et al.,
Science, Vol. 217, pp. 934-937 (1982)) was cloned into the Bam HI and Cla
I sites of the human keratin K1 expression vector (Figure 4). To
discriminate expression of the v-ras^Ha transgene from that of the
endogenous ras gene, a sequence encoding the human keratin K6 epitope
(SEQ. ID. No. 4) was engineered onto the 5' end of the v-ras^Ha cassette.

HK1 ras transgenic mice exhibit the following phenotype: 1)
Newborn transgenic mice expressing v-ras^Ha (HK1 ras) exclusively in the
epidermis show distinct wrinkled skin at 48 hours and are smaller than
litter mates. 2) Juvenile HK1 ras transgenic mice exhibit progressive
keratinization which peaks at 14 days. 3) The histotype of newborn HK1
ras mice reveals massive epidermal hyperplasia with up to 20 fold
thickening of the epidermis. 4) By day 14 this progresses to massive
hyperkeratosis. Both histotypes are pre-neoplastic, papillomatous,
non-dysplastic and exhibit few appendages.

The HK1 ras transgenic mice develop benign tumors. Typical
lesions appear within 10-12 weeks at single sites. The histotype of these
tumors reveals a well differentiated squamous papilloma. Papillomas
often appear at sites after wounding. Many of these papillomas are prone
to regression. This regression phenomenon suggests that ras^Ha alone is
insufficient to maintain even a benign phenotype and requires further
events which may involve roles for additional oncogenes/antioncogenes.

EXAMPLE 7

Targeting the fos Oncogene to the Epidermis of Transgenic Mice
Recent in vitro studies have shown that the v-fos gene can convert
to malignancy primary keratinocytes or papilloma cell lines which
expressed an activated ras^Ha (Greenhalgh, et al., PNAS, USA, Vol. 87, pp.
643-647 (1990); Greenhalgh and Yuspa, Mol. Carcinogen., Vol. 1, pp.
134-143 (1988)). This suggested that fos could play a later role in epidermal
carcinogenesis and cooperate with the benign phenotype imparted by
activated ras^Ha expression. Although this alone was sufficient to initiate
the establishment of HK1 fos transgenic mice with a view to mate with
HK1 ras mice, two further studies have identified a role for fos in normal
epidermal differentiation and thus highlight fos as an attractive target for
perturbation. Using a c-fos/β-gal fusion gene, Curran and co-workers
(Smeyne et al., Neuron, Vol. 8, pp. 13-23 (1992)) have shown significant fos
expression in the differentiated layers of the epidermis and Fisher, et al.
(Development, Vol. 111, pp. 253-258 (1991)) have localized c-fos expression
to a specific subset of granular cells. Thus, fos may have an important role
in the control of the final stages of keratinocyte differentiation. The
putative perturbations of this normal role for c-fos in such specialized cells
by v-fos can only be explored in the context of targeted expression in
transgenic mice. In addition, the c-fos proto-oncogene is known to function
as a transcriptional regulator in conjunction with the c-jun/AP1 gene
product and thus, while targeting ras^Ha represents studies of membrane
signalling on neoplasia, targeting fos explores the role of transcriptional
control on this process.

Thus, the fos protein coding sequence from the FBJ/FBR chimeric
v-fos plasmid pFBRJ was inserted into the human keratin K1 targeting
vector (Figure 5). To discriminate expression of the v-fos transgene from
that of the endogenous fos gene, a sequence encoding the human keratin
K1 epitope (SEQ. ID. No. 6) was engineered onto the 5' end of the v-fos
cassette.

HK1 fos transgenic mice exhibit the following phenotype: 1) A
specific ear phenotype typically appears at 3-4 months initially in the
wounded (tagged) ear and then becomes bilateral. 2) In several animals
expressing severe phenotypes, the wounded ear lesion can grossly resemble
a benign keratoacanthoma. 3) Alopecia and hyperkeratosis of the axilla
often develop in older animals (approximately 1 year of age).

The histotypes of the HK1 fos mice are as follows: 1) The
histotype of the initial ear lesions exhibits hyperplasia and hyperkeratosis,
a pre-neoplastic pathology with few dysplastic cells and little evidence of
further neoplastic progression. 2) At later stages the massive
hyperkeratotic histotype resembles a benign keratoacanthoma.

Three HK1 fos transgenic mouse lines have been established which
develop an obvious pre-neoplastic ear phenotype at 3-4 months. The
promotion stimulus derived from wounding (i.e. ear tag) appears to
accelerate the appearance of this phenotype which eventually becomes
bilateral. It also appears that friction in the axilla and inguinal area may
promote a pre-neoplastic hyperplastic/hyperkeratotic response after
a significant latent period. Collectively these data support a fundamental
role for the fos gene in normal keratinocyte differentiation, and
perturbation by v-fos results in pre-neoplastic differentiation disorders.
In several HK1 fos mice severe ear lesions appear to progress to resemble
benign keratoacanthomas. Although numbers are low at this time, the
fact that this is the resultant tumor type is consistent with a role for fos
in the latter stages of terminal differentiation, and the low numbers and
latency suggest a requirement for additional events.

EXAMPLE 8

Targeting HPV 18 E6 and E7 Gene Expression to the Epidermis
There is widespread evidence from clinical and epidemiological
studies which implicates human papilloma viruses (HPV) in the etiology of
certain squamous epithelial tumors in humans. HPVs have a specific
tropism for squamous epithelial cells and different types of HPVs have
specificity for the anatomic site that they infect. Additionally, within a
specific subgroup of HPVs, certain types are associated with development
of either benign (e.g., HPV-6 and 11) or malignant (e.g., HPV-16 and 18)
disease and this may center on the properties of the E6 and E7 genes.
Through adaptation to the differentiation programs of the epithelia that
they infect, HPVs have evolved a clever strategy for the production of
infectious progeny. HPVs infect basal epithelial cells but do not undergo
lytic replication in this compartment; thus, the germinative pool of cells
is not subjected to the cytopathic effects of late viral gene expression.
Production of virus only occurs in terminally differentiated cells that have
lost proliferative potential and will be desquamated into the environment.
This strategy not only provides for the spread of mature viral particles,
but ensures their continuous production by replenishment with cells from
the basal compartment. Since the life cycle of the virus is so tightly linked
to all stages of differentiation of squamous epithelial cells, establishment
of successful culture systems has been difficult. To date, these host
factors, coupled with regulatory mechanisms present within papilloma
virus genomes themselves, have also hindered attempts to observe
pathological effects of HPV gene expression in squamous epithelia in
transgenic mice. These restrictions on utilization of the transgenic mouse
model have been overcome with the ability to specifically target HPV gene
expression to squamous epithelia using the human keratin K1 targeting
vector. In the example provided, the coding sequences for the E6 and E7
genes of HPV 18 were inserted into the human keratin K1 targeting
vector at the Bgl II and Cla I sites (Figure 6).

HK1 E6/E7 mice exhibit the following phenotypes and histotypes:
1) One mouse exhibited a subtle skin lesion at 7 months characterized by
skin rigidity, thickening and roughness underlying the fur which later
progressed to a wart-like structure by 10 months. 2) The histotype of this
lesion exhibits hyperplasia, hyperkeratosis and the beginnings of
verrucous formation. 3) The histotype of a lesion from another mouse at
11 months is characteristic of a typical wart induced by HPV.

To date three HK1 E6/E7 transgenic mouse lines have been
established which develop HPV-like lesions with low frequency and long
latent periods. At this time it is unclear whether this limited appearance
of phenotypes reflects the subtle nature of the lesion, and its requirement
for a long latency period, or the complex nature of HPV biology. However,
it is noteworthy that our result is consistent with the epidemiology of
HPV infections in humans, e.g. although a large percentage of a given
female population can test positive for cervical infection by HPV 6, 11, 16,
and 18, relatively few progress to develop overt lesions. It may be
therefore that the apparent delay and low phenotype frequency exhibited
by these mice provides a relevant background to study the consequences
of HPV expression during epithelial differentiation. In addition, these
mice can be useful in assessing the efficacy of novel antisense
pharmaceuticals which have been designed to inhibit expression of the E6
and E7 genes of HPV 18.

EXAMPLE 9

Production of Transgenic Mice Expressing TGF-α in the epidermis
Transforming growth factor alpha (TGF-α) is a cytokine with
structural and functional characteristics similar to epidermal growth
factor (EGF). Both TGF-α and EGF bind to the epidermal growth factor
receptor (EGF-R) and stimulate the tyrosine kinase cascade. TGF-α is
expressed by both normal and transformed cells and causes proliferation
of cultured keratinocytes. In vivo, TGF-α induces angiogenesis and is
more potent than EGF in accelerating wound healing. In normal human
skin, expression of TGF-α occurs in all layers of the epidermis and in
certain areas of the appendages. Several cutaneous diseases such as
psoriasis, squamous cell carcinoma, and congenital bullous ichthyosiform
erythroderma have been associated with altered expression of TGF-α.

To determine whether altered expression of TGF-α plays a role in
the pathogenesis of these diseases, the protein coding sequence of human
TGF-α was inserted into the human keratin K1 targeting vector (Figure
7). Injection of the HK1.TGF-α construct into embryos resulted in
phenotypic founders that were quite similar to those of ras^Ha. The
histotype was also similar, with epidermal hyperplasia, hyperkeratosis and
relative alopecia. One founder (2 1/2 months of age) has developed
multiple papillomas. Histologically these appear to be squamous
papillomas. To date, none of these lesions have converted to a malignant
phenotype.

EXAMPLE 10

Production of Transgenic Mice Expressing the HIV tat gene in the 
epidermis. 

Patients infected with the human immunodeficiency virus (HIV) are
at high risk for the development of specific AIDS-associated cutaneous
disorders. Often patients manifesting symptoms have skin lesions ranging
from hyperproliferative conditions such as psoriasis to Kaposi's sarcoma
and metastatic basal cell carcinoma. The precise role of HIV genes, the
cells of origin and hence the etiology of such skin lesions remain unknown.
It may be that specific HIV genes, e.g., the trans-regulatory protein tat,
play a role directly or indirectly in the homeostatic mechanisms of host
cells and tissues. Alternatively, the HIV tat gene may interact with or
activate other viral genes present from latent or opportunistic infections,
e.g., human papilloma virus (HPV). To directly assess the role of
keratinocytes in the development of AIDS-associated cutaneous disorders,
the HIV tat gene is targeted to the epidermis of transgenic mice.
Targeting of the tat gene and exclusive expression in keratinocytes is
achieved by the use of the human keratin K1 vector (Figure 8). The
development of strains of mice which develop cutaneous lesions with
predictable kinetics as a result of expression of the HIV tat gene alone or
in combination with other oncogenes serves as a useful model for
assessing therapeutic potential of antisense pharmaceuticals designed to
inhibit expression of the HIV tat gene.

EXAMPLE 11 

Utilization of the HK1 Vector for Gene Therapy Applications
Where exclusive expression in epidermal cells is desirable and for
transient expression, the HK1 vector is an excellent choice for gene
therapy. Unlike the human keratin K1 gene itself, the human keratin K1
vector derived from the 12 kb fragment is expressed in proliferating basal
cells in the epidermis. In more recent transgenic experiments, it has been
determined that a larger fragment containing the human keratin K1 gene,
an 18 kb Eco RV fragment (shown schematically in Figure 9), is expressed
exactly like the endogenous mouse K1 gene, i.e. post-mitotically in cells
committed to terminal differentiation. These cells are programmed to die
and will eventually slough into the environment. Therefore, for human
applications where transient expression is desired, it is possible to design
a vector that will only be expressed in cells after they commit to terminal
differentiation and begin moving upward toward the outer layers of the
epidermis. The vector will be expressed approximately 10-14 days prior
to being shed into the environment. This can be accomplished by
inserting additional 5' flanking sequences from the 18 kb Eco RV
fragment onto the end of the original human keratin K1 vector (see Fig.
10).

EXAMPLE 12 
Detection of Carcinogens and Tumor Promoters 
Short-term tests (STTs) for genotoxic chemicals were originally
developed as fast, inexpensive assays to assess the potential hazard of
chemicals to humans. However, a recent report summarizing the results
of a project initiated by the National Toxicology Program to evaluate the
ability of STTs to predict rodent carcinogenicity questions the validity of
relying solely on STTs. Three of the most potent carcinogens detected
in the rodent assays produced no genetic toxicity in any of the four STTs
evaluated (Tennant et al., Science, Vol. 236, pp. 933-941 (1987)). Thus, to
receive EPA/FDA approval for new compounds, chemical, agricultural,
food and drug companies are currently required to perform two year
animal tests costing up to $2 million. The development of new transgenic
strains of mice that have been genetically engineered to rapidly detect
carcinogens and tumor promoters would substantially reduce the overhead
cost of long-term animal studies. The suitability of the transgenic mouse
lines claimed in this patent application for rapid detection of carcinogens
is initially determined with a known skin carcinogen, DMBA, to determine
whether benign lesions appear earlier than in control non-treated litter
mates. To determine suitability for detecting tumor promoters, a known
promoter, 12-O-tetradecanoylphorbol-13-acetate (TPA), is applied to ras^Ha
and fos mice. Since benign lesions in ras^Ha mice and hyperplasia in fos
mice appeared at sites of wounding (i.e., tagged ears), and wounding can
promote tumor formation, these lines are useful for these studies.

All patents and publications mentioned in this specification are
indicative of the levels of those skilled in the art to which the invention
pertains. All patents and publications which are incorporated herein by
reference are incorporated to the same extent as if each individual
publication was specifically and individually indicated to be incorporated
by reference.

One skilled in the art will readily appreciate that the present
invention is well adapted to carry out the objects and obtain the ends and
advantages mentioned, as well as those inherent therein. The bioreactors,
nucleic acid sequences, transformed epidermal cells, transgenic animals
and human keratin K1 vector, along with the methods, procedures,
treatments, molecules of specific compounds, are exemplary and are not
intended as limitations on the scope of the invention. Changes therein
and other uses will occur to those skilled in the art which are
encompassed within the spirit of the invention as defined by the scope of
the claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: Roop, Dennis R.
Rothnagel, Joseph A.
Greenhalgh, David A.
Yuspa, Stuart H.

(ii) TITLE OF INVENTION: DEVELOPMENT OF A VECTOR TO TARGET GENE
EXPRESSION TO THE EPIDERMIS OF TRANSGENIC ANIMALS

(iii) NUMBER OF SEQUENCES: 5

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Fulbright & Jaworski
(B) STREET: 1301 McKinney, Suite 5100
(C) CITY: Houston
(D) STATE: Texas
(E) COUNTRY: U.S.A.
(F) ZIP: 77010-3095

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Paul, Thomas D.
(B) REGISTRATION NUMBER: 32,714
(C) REFERENCE/DOCKET NUMBER: D-547a

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 713/651-5325
(B) TELEFAX: 713/651-5246
(C) TELEX: 762829

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10747 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:



GAATTCGGCT GCTGTGCTGT CGTACAACAT GCTGTTTAGG ATCTTGCACA TGATAGCTAG 60 

GTATTCTTGC TTCAAATCGC AGGCACCCCA CTTACCAACT GTGTAGACTT GATCACGTTA 120 

TTCAACCCCT 6TGTCTCTGC TTCCTCATTT TACAAATGGG GAGAAAAATA GCATCTATCT 180 

CAAA6TTGTG AAAATTAAGC AAGTTAATAC ATATGTGCTA CGTA6AACAG TGCCTGGTAC 240 

ATGGTCAGTT TTTGATACAT GTTAGGTATT ATCATTATTA TCACCTCCAG AAACAATTTA 300 

AACTTCTCAT ATAAGGCTCT CCAGACACCT CTCATTGTCT TCCCTTCCAA ATCTGCATTT 360 

ATCTCTCTCT CTTTGCAOTC CAGTGTGAGG CTTGAATCAC CTATCAAGCC TCACCTCCAC 420 

CCCTGTGCTT TACAAAATGT CCTAGAGCTT CTATTTACTC GTCTCACTGC TCTGTGGGCT 480 

TTTTCACTCA AGG6CGTTTG CATGCTATCC ATTGCTACCT GTTTTCTGTT GCT6GTGTCT 540 

GTCTCCTGCT CTATCTTTGA AGAAAAGAAA CAAGAAAAGG AATAACTGAG AAACAGAGAA 600 

AAAAAATGTC TCTCCCTTCT GGTTCTTCCA GACCACCCAC TCATCCATCT TGTTCAATGA 660 

CAGCTTCTCT TCCTTTAATT AATCACTGTG GTATATTTAT AAAGCTTATA TTTATGAAAG 720 

ACCTTTTAAT TTTTTAGTTA 7TAAAGCCCT TTCTCTTTGT CAGGTTGTAA CTGAGTGAGC 780 

TCTGGAGTTT GGAAAGAAGA TCTTAGAAAT GGGCCAGAGA GCTCCTTCTG AGATCCAAGC 840 

ACGGAGAATT GCACCTGCTG TGCATGGTAA GA AGTGTGC TTGGTA CTC ACAAGGGCAA 900 
GGTGAGAATA GAAACTXTCA TGCCTTTTTG ATGGGGGTTA TGAAATCCTA CCAAGAAACA 960

CCAGGTATCA GATGT6GGGT CCTGTTTTCC CAAAGCCACA AATGCTTGAA GGAAGATCTT 1020 

GTGTGATAAA ATAATTACCA CAT6AACCAA TCTTGCATGC ACAGCAATM? TGAGAGCCCA 1080 

TCCTGGGAGC TAGGTGlTGXA GTGTTTATCG TATTGTTGAG GCTCGTAAAA ATCTTGTATG 1140 

GCT6CAGGCA AGCCAAACCC TTGACAGGCA CTGCATCTCC GCXGACTCTA GAAGACCAAG 1200 

CCCAATTTCT TCCCTGTATA TAAGGGGAAQ^ TCXCTATGCT TGGGGTAGAG GAGTGTTTAG 1260 

CTCCTTCCCT TACTCTACCT TGCICCTACT TTTGTCTAA& TCAACATGAG TCGACAGTTT 1320 

AGTTCCAGGT CTGGGTACCG AAGTGGAGGG GGCTTCAGCT CTGGCTCTGC TGG6ATCATC 1380 

AACTACCAGC GCAGGACGAC CAGCAGCTCC ACACGCC6CA GTGGAGGAGG TGGTGGGAGA 1440 

TTTTCAAGCT GTGGTGGTGG TGGTGGTAGC TTTGGTGCTG GTGGTGGATT TGGAACTCGG 1500 

AGTCTTGTTA ACCTTGGTGG CAGTAAAAGC ATCTCCATAA GTGTGGCTAG AG6AGGTGGA 1560 

CGTGGTAGTG GCTTTGGtGG TGGTTATG6T GGTGGTGGCT TTGGTGGTGG TGGCTTTGGT 1620 

GGTGGTGGCT TTGGTGGAGG TGGCATTGGG GGTGGTGGCT TTGGTGGTTT TGGCAGTGGT 1680 

GGTGGTGGTT TTGGTGGAGG TGGCTTTGGG GGTGGTGGAT ATGGGGGTGG TTATGGTCCT 1740 

GTCTGCCCTC CTGGT6GCAT ACAAGAAGTC ACTATCAACC AGA6CCTTCT TCAGCCCCTC 1800 

AATGTGGAGA TTGACCCTGA ^ATCCAAAAG CTGAAGtCTC GAGAAAGGGA GCAAATCAAG 1860 

TCACTCAACA ACCAATTTGC CTCCTTCATT 6ACAAGGTGA GTTTCTCTCT CATTGCACTG 1920 

GTAGGGCTGC CGCT6GTCCA CTTGGGATTG GTGCAGTGAA AACACATGTA GGTTTGAACC 1980 

TCAAGTTTCQ ATGTTTACAT GATTAAAAGG ATGTTTTGTG GAATGGTCTC CTAGGAGATA 2040 

TGTTAGAT6T ATGCTTGTGA ATGGTGTTAA TGACTCTCTC TTTGACAAAG GGTTCGTGGT 2100 

CGACCTAAAG GTGGGTCA6T GTGACATTAA CATTTAAGTG CTTTTTATTC AGCTCTTGAG 2160 

CGGAATTGGG ACTCATATCT GTTGAATGAA GATAATAGAA ATGGGGCTAA CTGAACTTTC 2220 

CAGGGTGCAA GTGAGAACCC TGGAAAGGTC TTCCTAACCA TAGAAAG6GA GTTGAGTGTG 2280 

AACATAGTAT AGAGTGTTAT TGTAGCAGAA AACATGTGGT CAGTCAGTGC CAAACATCTT 2340 

TTGCTGTCAG AGGGGAGCTC TGCCTTCTAA TAATTTTACA TTGGTACTGG ATGAGGCTAG 2400 

AGTTTTTTTA TACTAATATC TCCAAAAATC AGCTCTAAAA AACTCAGATA AACCATTTTT 2460 

TTAATTTTTT GCTTAATCAT TAATAGTGCC AATCCAAGGT TATCCACAAC AAATTTCAAA 2520 

TCCAATTTTG AATTTTCCTG ATATACTTTT GAAATGTGTG TGTGTCCT6G GGATGCAAAC 2580 

CAGTTTTTAT GGTAATATAC CTAACAAAAT TTTGGAAGGC AAATCTCTTA AATACCATGC 2640 

ACCTATTTCA AAACATAATT GCAATAATTC TGTATGCGCT TTGCTATTGG TATTTGTTTA 2700 

GTTACTCCCT TCCAA6CCCT CTCTGAATTA ACAAGTTGGG TTTTATTATG CAGATGATAT 2760 

TAACTTGATC ATCTTCTTCC TllTTTCTCTG TCATG6TCAG AAGATAGGAA TTGAG6TTCT 2820 

TTTCCAAATG AGGCACAGTT CTCCATGGCT ATGAGACTCC ATTTATGCAT CAGGAGTAAA 2880 

GGGGTCTTGT GTITTTAGGT 6A6GTTCCTG GAGCAGCAGA ACCAGGTACT GCAAACAAAA 2940 

TGGGAGCTGC TGCAGCAGGT AGATAQCTCC ACTAGAACCC ATAATTTAGA GCCCTACTTT 3000 

6AGTCATTCA TCAACAATCT CC6AAGGAGA GTGGACCAAC TGAAGAGTGA TCAATCTCGG 3060 

TTGGATTCGG AACTGAAGAA CATGCAGGAC AT6GTGGAGG ATTACCGGAA CAAGTAAGGG 3120 

ACCCTGTCTG GGCAGTTCTT AACTTTTGCT GTAAAAGAGT TCCAGAAAGT AATAAGCTAA 3180 

6ATCATGAAG CA6CATGTAG CTATGTCTTT TCTTAGGTTA GAGGCACATC AGTTTGACAT 3240 
TTTCAGAAAT CTTCATTTtC TCAGGAGAT6 GAAATAGTCT AGTGGTTTTA TTGCTCAGTA 3300 

GAAAGTAGTG GCCAATATGT CCTAGGTTCA TAATAGAAAG GCAGTGATAG GCAATGCCAC 3360 

CTTTAGTTTA GAATGCTGGA CTTCAGGTCT TACCACCTCT GAATCTCCTA ATTGTTTCTG 3420 

CTTTCCTGCA GGTATGAGGA TGAAATCAAC AAGCGGACAA ATGCAGAGAA TGAATTTGTG 3480 

ACCATCAAGA AGGTAAGCAA ATTCTGTAGG ACX^AACTCA CATTTGAAAT AAATAAGGGA 3540 

AGAGGGTCTC CAATTACTAA GCAGAAAGCA GCCATGATAT GGAGAGCCAG GTAGTA6ACC 3600 

TGGGGAGTAT ATGGAGTGGG GCTATATTTT TCACATCATC ATGGACCTGG ACTGATCCAG 3660 

GCACTTGGCT TCTCCATATT TCCCAGCACC TTACATAGTA AGTGGAGTGG CAGATTCTCA 3720 

GCAAGCCAGG CACACTCCCT TGAXGGtGCT ATCCGGGGGT GGGACAGTTA GGGAACTGTG 3780 

ATTTACCTGG GGCAAAAAGG AGTGGAGTAG ACCCAAAGCT CCTTTTTTTG CTTGGAGAAT 3840 

CCCCTCACAG GTAATGAGAG GGACCTGCCC TG6AGA6AAC GTGCCTTCAT GATGTCCCTT 3900 

GTTCCTCTAG GATGTGGATG GTGCTTATAT GACCAAGGTG GACCTTCAGG CCAAACTTGA 3960 

CAACTTGCAG CAGGAAATTG ATTTCCTTAC AGCACTCTAC CAAGCAGTAA GTCTTCCAGX 4020 

TTCAACCAAG TTTAXCTAAA TGGAGAGTTT TTAAGCCGGA ACCCACAACG ATTCAGAAGA 4080 

ATAGATATTT ATCTTTTATT TCCTGACTGC TTTCTCTGTC TAAGTTGTTT TTTGTTTTAG 4140 

TGCTGTAAGA GTCACTAACC TATTATGTCT TGCAGGAGTT GTCTCAGATG CAGACTCAAA 4200 

TCAGTGAAAC TAATGTCATC CTCTCTATGG ACAACAACCG CAGTCTCGAC CTGGACAGCA 4260 

TCATTGCTGA GGTCAAGGCC CAGTACGAGG ATATAGCCCA GAAGAGCAAA GCTGAGGCCG 4320 

AGTCCTTGTA. CCAGAGCAAG GTGAGTGGGC TGAAACCGTA GCCAGTTTCC CTGAAATGGC 4380 

TTGTCTTGCT ATCCTGTGTT ATCTCATGTA TGTGTGCCTG TGCCATGCTG AGTTCTGCCT 4440 

ACATTTAACA AACX;CTATCT ACCATCTTTA GtATGAAGAG CTGCAGATCA CTGCTGGCAG 4500 

ACATCGGGAT AGTGTGA6AA ATTCAAA6AT AGAAATTTCT GA6CTGAATC GTGTGATCCA 4560 

GAGACTTAGA TCTGAAATCG ACAATGTCAA GAAGCAGGTA TGTGCTTTCT CCTTCTACCA 4620 

CTCAGCTGTA TGGAA7GGGG GTAACCCTCA GGTAAAGGGC GAGTGCTTTC CTAGTTTTGA 4680 

ATCTTGCAAT TCAQCCCAAG GCTACATTAT TAGCCCTGGT TCCTTTTCTG ACTATGCTAG 4740 

TTTCCAGAAT GCAGCCATCA TGCTGGGTTC TCTTTAGGGA AATCTGTGAG AATGGCCTAG 4800 

TAGAGAAAGA TGGGATGGTC AATGTGAGTG ATCTAGCCTA TGACCCAAAG TGGACTTAAG 4860 

AGTTGGGGAG TGAGAGGAAG GGCAGCCAGG AGGTTTTAGA GTAGGTGTTT AGAAGAATGT 4920 

CAAGTCTGTA AGGGTTGTAG GAGCCTTGAC TCAGGGCCAA GAGAGGCXGT TGAGXXAXCC 4980 

CXAA6GXCXX XXAAGGAAGX CAACAXGGXG AXGXGXXAXC XGGAGGXGGG XGXGAGAXGA 5040 

CXXAA6GCCA AGXGGXXCXG XXGGACXCAX XAXXGGCCXC ACXGGAGXGG GGAGACCAAX 5100 

XGGGAXGAGG AGGCCTAGXG GGGAAXGCAX AXXAXGAGAG GGXGXCAXAX CXXXXXCAGA 5160 

XCXCCAACXX GCAGCAGXCC AXCAGXGAX6 CAGAGCAGCXi XGGCGAGAAX GCCCXCAAGG 5220 

AXGCCAAGAA CAAGCXGAAX GACCXGGAGG AXGCCCXGCA GCAGGCCAAG GAAGACCXGG 5280 

CCCGCCXGCX GCGXGACXAC CAGGA6CXGA XGAACACCAA GCXGGCCCXG GAXCXGGAGA 5340 

XXGCCACCXA CAGGACCCXC CXGGAGGGAG AAGAAAGCAG GXGAGGAAG6 GACGCXGGGA 5400 

GXCGAACCXC XXCXCAXGGX CXXCCXXCCX XGCAAGCXGA XXGXXGXXGA AGAXGCAGCC 5460 

AXCXGAXXGC AGCXXGX6CX GGGXAXGGGG AAAXGAAAAG XACACGGAGC AGGAGGAAG6 5520 

ACCXAGXXXX ACXTXGGGAG CXGGAGXCCC AAGCXGXXXA XXXXXXXCXX CXAGGGCXGX 5580 
AACATATCTA GAAAGAGCTT TGAGGTG6A6 CAAATXATTC TXTATCTGGG CXGCCTCAGA 5640 

TGGCAGCTGG CCTAAAGTCG GCATCTTTAG AGGGGGCCTT CATTGGCTGC AAGGCTCGTC 5700 

TCGTTTATAT GGGAATTTCT CCGTGTTTGT ACTCTTGCTG AGAAAAAATG ACAGGTCTGG 5760 

GAGGCCAGAG GGGATTGGAT TAAGTTTCAG ATTAAGTGCA TTGGAGAAGA CCCAGATGGG 5820 

6AAAGTCTTC AAGGT6GT66 AGCGGGGAAT dGGGAAGCGG TTTGGGAAGC TGGAGTGTCC 5880 

TGAGGAATTT TCTTATTTTC TCCTACAGGA TGTCTGGAGA ATGTGCCCOG AACGTGAGTG 5940 

TGTGTAAGTA CAAGTCGATT TCTCAG0G6C ATGTCCA6GC TTTGTT6GGC TGGAAACGGA 6000 

GTTGAGGTTG AAAATAACTG AGCTTCCXCT TGCAGCXGTG AGCACAAGCC ACACCACCAX 6060 

CAGXGGAGGT GGCAGCCQAG 6AGGXGGCGG OGGXGGCXAC GGCTCXGGAG GXAGCAGCTA 6120 

XGGCXCCGGA GGXGGXAGCX AXGGXXCX6G.AGGXG6C6GC GGCGGCGGCC GXGGCAGCXA 6180 

XGGCXCCGGA GGXGGCAGCT AXGGCXCXGG AGGXGGCGGC GGCGGCCAXG GCAGCXACGG 6240 

CXCGGGAAGC AGCAGXGGGG GCXACAGAGG XGGCXCXGGA GGCGGC6GCG GCGGCAGCXC 6300 

XGGCGGCCGG GGCXCXGGCG GCGGGAGCXC XGGAGGCXCC AXAGGAGGCC GGGGAXCCAG 6360 

CXCXGGGGGX GXCAAGXCCX CTGGXGGCAO XXCCAGG6XG AAGXXXQXXX CXACCACXXA 6420 

XXCCGGAGXA ACCAGAXAAA GAGAXGCCCX CXGXXXCAXX AGCXCXAGXX CXCCCCCAGC 6480 

AXCACXAACA AAXAXGCXXG GCAAGACCGA GGXCGAXXXG XCCCAGCCXX ACCGGAGAAA 6540 

AGAGCXATGG XTAGXXACAC XAGCXCAXCC TAXXCCCCCA GCTCXXXCXX XXCTGCXGXX 6600 

XGCCAAXGAA GXXXXCAGAX CAGX6GCAAX CXCAGXCCCC XGGCXAXGAC CCXGCXXXGX 6660 

XCXXXCCCXG AGAAACAGXX CAGCAGXGAC CACCACCCAC AXGACAXXXC AAAGCACCXC 6720 

CTXAAGCCAG CCAGAGXA6G AdCAGXXAGA CCCAGGGXGX GGACAGCXCC XXAGCAXCXX 6780 

AXCXCXGXGC TGXXXXGGXX XXGXACAXAA QGXGXAAGCA AGXXGXXXXX CXXXXGXGGA 6840 

GAGGXCXXAA ACXCCCCAXX XCCXXGXXXX GCXGCAAXAA ACXGCAXXXG AAAXXCXCCA 6900 

XGXCXCGAXC GGCCXXGXXX ACGGCACXGX CXAACCXGGA XGGGXGXXXX GXGAGGXAAA 6960 

AGAAGACACX AGAGCCACAT GGCAXAXGGG AAAGXCAXGfC ACACAAACAX GAGAAAAAXG 7020 

CAGAGGCCAA CCAGGGAACA XXXCACCAGA CXGGAAXCAC AGAGAGAGCA AGCACXXXCC 7080 

CAGAXGGXGG GGAXGXCAXG GAGAAAXGGA GAGAGCGGGX GACAGGXXXX GXXCAXXXGA 7140 

GAAGGCTTTC TTGAAAAGGG CAGTGAGCAA GCAGGTTGGG AGGAAGAGGT GTGGCATTGA 7200 

GAA0AA6GGA AAGXATTGCA XGAAAAAGXA AXXCXXCACG XGGAACAGCC AGXAAGGAGG 7260 

GGCAXGAGXA ATAXAGGGXC AGCAGXXACX GGAGCCAGAA XACAGACXXX GGCCXGGGGA 7320 

GXXCAAGAAC XAAGAGXGGX AAXA6AGAGX XGGAXAXXCC AXXXCCCXXC XCXXXXXGXG 7380 

CCACCACCCA AAGCXCXGCA XAAXCXAA6A AGXXGCCXXG XXGACACAXA GCXGAXACXX 7440 

GXGAAGXXGX ACAACAGGAX AGCAXAGXGG CCAGAAGCAX GGACAGXXGA ACXCAGAXAX 7500 

GCXXGGGXXX GAAXCXXACC AXCACCAXXX ACXAGXXCXG XAAXACAGXG CAAGXXACAG 7560 

ACAXCXCXGC ACCXCAGXXX XAGXAXGXCX AAAXXGGGGA XGAXAAX6CC TXCCXXGXGG 7620 

GGATAGXGXG AGGATXGAAX AAGAXGAATA CACAXGGCXG AGCACACAGC AAGCACXAAA 7680 

XAAGXGCCAG XXTXAATGAX AACGGXGAXG AXGAXGAXGA XGAXGAXGAX GACGTAACAX 7740 

XGCXXGXGGG ACXCCAXACA GCXCAGXAGA XGCXXGCXCA AAGAAGCAAG XXACCAAAAX 7800 

XTXXGXAAXG GXXCXAXGAA CGXGAAAAAA GCAGtCAACX XCXCXGAGGA XCAAXXXCCX 7860 

XAGXXTCCAA XTAGGAAAAG XCXXCXXAGC XCCAGAGXCC CACAGGGCXA AXGGAAXAAG 7920 
GAGAGGATAG ATCACACATG TATTATGCAA ACACAACTCA GGTGAGCTCT ATTCTTCCTT 7980 

CTCAGTTATC CCTTCTGTAG GGACCCCAGT GTCCCCTGCT GTCTTTCTGT GTCCTGACCG 8040 

GGAAACACAG TGTGCCTTGT CTACTCCATC ACTTGGCCAG CTGCATGCTT TCCTTTGCAG 8100 

GCTTGAAGCA AAGCTGGGTC TCGGACATTC TCAGGCACTG ACAAAGCTGT TTAGTTGTTG 8160 

CTGG6AAACA CTCGGAAATA GCCCTTTTGT TAAACACACA GAAACTAGCC TTCGCCCTGA 8220 

GCGAAATTCC TTAAACTCGT CTATGAAATT CCATAACCTG ACTCCTTAAC TGCAGACATA 8280 

CCCAGCTAGA ACATCCCTCA TGTCCCTGTC CACCGTGAGA ATGCTGCACT TCACTCTGAA 8340 

CCTTTAGTCC TCCTTTTAAA TACTGCACAC TGATCACCCT GGTGTTTAGT GCTTTGTTTT 8400 

TTGGAATCCC ACCTGGCTCC ATTTTGGGAT GGTTCC6GGC ACTTCCCTAT GGAAATTCCC 8460 

CTGCTGTCAC TGTCAGAGTG AGTCCAGCAG TGGGTTTAGC TGGATGAAAC ACCACCATGT 8520 

CCATTTCCAT TCAGACTAAT GTCAGAATTT GAAAGGCACT ATGGTAGAGT AGAAAGAACA 8580 

AGGAACTGTA CTATTTAAAG GGCAGGCAAA GAAAA6GCAT CTATAGCTTA TAAGATGT6T 8640 

GGATCTTTGG ATGTGACT76 GCCATCCTGA GCCTAAGTT6 TCTTGTAGGA GAAATGGGAA 8700 

TGAGAATATT TTCCTCTAGA CATCAAGAGG AAAAGAAATA TAACGTGAAA ACCTTTGTGA 8760 

ATTGTGAATG TGTTATACAG AGTAGCTAAA AGAATTAAAA AGGGAGTGAC AAAAAAGTAA 8820 

AAGGCAGCTG GCTGCTCAGG GCX^TCCATGG AGGGAAGTAC CTTGATATGG TCACTGTGGC 8880 

TCAGTGACAG CTCTGCAGGG AGAGGAAATT GATTTGTTAG T6CACCCAAA GTTGAATCTG 8940 

CTCCTGA6TA CT6ATTTATG GGAACCAAAC ACACAA6AGA TGAAGGATGT GTCAACCAGA 9000 

ATGTCCA6CA TTAGCTTGTG GGGAAACACA TACTTCCAGT GACTGAAATA CCATCCTGTT 9060 

ATCAAGAGAT CTGGGAAACT AAAGTACTGA CAAGAGCT6G CTTGATCTGT GGATTTAGAA 9120 

CAATGAGAGT TAGGTGGCCT TGAGGGAGAT GATTCACTCT CC7TCACAGA AGAGCTGACC 9180 

TCTGGGGTCA ACAGATATAG CACCTCTTTC CCAGGGACGC TACTGAATGA ACAGTGATGT 9240 

GTTCTTATAC TCTGGCCCAG ATTTTCTACA TACTTTCTTA GGTTACAACT TTATTTAGTC 9300 

ACATTTCAGT ACTGGGGATA CTCCTGTTTA TCTTCTTTGQ ACTCGAGTTT TTATGGGAAG 9360 

GTCATGAAAC AGAGAAAAAT ACAAT7TGCA GGGAAACTTA CCAAGGCTTG TllAGGTTACA 9420 

AGGATTAAAT GAAAACCCTG TGTAAGTCAG TATATAGTGA AGAAGTAAAT TGAGTTAGAC 9480 

CAAACGCCAA AATGCATCCG CATTAGAAAG ACGAtAAAGG AAGACTCTGG ATTCAGTTCT 9540 

GTTCAAAAAA CATTTTCTGC ACAAATACTA TGtATGAGGA ACTGGGCGTT GGGGAGATGA 9600 

TGATGAGTGA GACATGGTTC TTGCT7TCAG AGAGCCTAGA GACCTGGGTG GTAGCAATGG 9660 

TAGAGATACA TCCAA6AGAC AGAAATAGAT ATACAGGAAC ACAGATGATT GAAAGTGATG 9720 

CTTGGCAGGG CTTTAAAGAA TGAATCAGAG TTTTTCAGGC AGACGAGGAT CTTCAAGGCA 9780 

GAGGGAATCA TATAGATAAG GACATAGAA6 AGTGAAATTT CATGAAGTAG TTAAGCATCT 9840 

GAAGAAGCAT GGAATTAGTG ACAAGAAATG ATGCGGAAAA GATATCCAGA TCCAATCAAG 9900 

AAGGGCCTTG TTGGCATTCT ATGGAGTC7G GACTTTGGCT TCTGGGTCAC AAGTTCTCAG 9960 
ATGGG6TTTT CATATCTATT ATTAGACCTA CTATGTACTG GTCCAGTGGA AGGGAAAGGG 10020 
GTTGTCTTAC TGCTAGTGGA GTAGGAATTG GGTATGGACC ACAGCTT6TC TTGTTTCCAA 10080 
GTATTCCCTA AGAAATCTGG TCTGCTGATG GGAGATCTAT TTAT6GAAAT GTCTTTTTCC 10140 
CTCAGGAATT TTATGTCGGA AACA6CTGTC ATAGGT6A6G AGGAACTGGT AAAAGTACTT 10200 
AATAGGAGAG TGTCATGGTC AGATTGGTGT TTTGGAAAAG TCAGCCAGGG CAGATTGGAG 10260 
AGGTCCATAT TGGACGCAGG AAGACTtAAG AGACTATTGC AAAGGTGAAG ACAAAAGACG 10320 
ATAGGGACTT GdACTTTAAT tCCAGCCCTT AGAAGTAGTA GAA6GTCAGA AATGAGAATA 10380 
TCCATTACAG AGATAGTTAG TTGCTATATC ATTAGGACXT GGTGATAGAT TGGATGA6GA 10440 
TGCGaTTGGG TGAGGCAAAG AGGAQAGTCC ACATTCCTGG TCTGGGTAGT AAGAAAGAAT 10500 
CTAGCAAGAG G6CTTGTGG6 GAAA6ATGCT GAGITACGTA GCAAGTGCAT CTGCTTTATC 10560 
CTTGTAATGA AtGGGGCTAA AGGTGTAAAC CAAAGAGTCA TCAGCAlTTTG GAGGGTAGAA 10620 
TAAATCATCA 6ATAACTCAG GAAGAA6GAG CAGAA6AATT ACTGATACTC CCT66AA6GA 10680 
AAACCGGAAG TAAATGGGAG AAACTTtGCTC AA6TGGACAA A6TTTAACAG ACATGAAGCA 10740 
TGAATTC 10747 

(2) INFORMATION FOR SEQ ID NO:2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6693 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 

GAATTCGGCT GCTGTGCTGT CGTACAACAT GCTGTTTAGG ATCTTGCACA TGATAGCTAG 60 

GTATTCTTGC TTCAAATCGC AGGCACCCCA CTTACCAACT ATGTAGACTT GATCACGTTA 120 

TTCAACCCCT GTGTCTCTGC TTCCTCATTT TACAAATGGG GAGAAAAATA GCATCTATCT 180 

CAAAGTTGTG AAAATTAAGC AAGTTAATAC ATATGTGCTA CGTAGAACAG TGCCXGGTAC 240 

ATGGTCAGTT TTTGATACAT GTTAGGTATT ATCATTATTA TCACCTCCAG AAACAATTTA 300 

AACTTCTCAT ATAAGGCTCT CCAGACACCT CTCATTGTCT TCCCTTCCAA ATCTGCATTT 360 

ATCTCTCTCT CTTTGCAGTC CAGTGTGAGG CTTQAATCAC CTATCAAGCC TCACCTCCAC 420 

CCCTGTGCTT TACAAAATGT CCTAGAGCTT CTATTTACTC GTCTCACTGC TCTGTGGGCT 480 

TTTTCACTCA AGGGCGTTTG Ca;rGCTATCC ATTGCTACCT GXTTTCTGTT GCTGGTGTCT 540 

GTCTCCTGCT CTATCTTXGA AGAAAAGAAA CAAGAAAAGG AATAACTGAG AAACAGAGAA 600 

AAAAAATGTC TCTCCCTTCT GGTTCTTCCA GACCACCCAC TCATCCATCT TGTTCAATGA 660 
CAGCTTCTCT TCCTTTAATT AATCACTGTG GTATATTTAT AAAGCTTATA TTTATGAAAG 720 
ACCTTTTAAT TTTTTAGTTA TTAAAGCCCT TTCTCTTTGT CAGGTTGTAA CTGAGTGAGC 780 
TCTGGAGTTT GGAAAGAA6A TCTTAGAAAT GGGCCA6AGA GCTCCTTCTG AGATCCAAGC 840 
ACGGAGAATT GCACCTGCTG TGCATG6TAA GAGAGTGTGC TTGGTA6CTC ACAAGGGCAA 900 
GGTGA6AATA GAAACTTTCA 7GCCTTTTTG AT6GGGGTTA TGAAATCCTA CCAAGAAACA 960 
CCAGGTATCA GATGTGGGGT CCTGTTTTCC CAAAGCCACA AATGCTTGAA GGAAGATCTT 1020 
6TGTGATAAA ATAATTACCA CATGAACCAA TCTTGCATGC ACAGCAATTT TGAGAGCCCA 1080 
TCCTGGGAGC TAGGTGTGTA GTGTTTATCG TATTGTTGAG GCTC6TAAAA ATCTTGTATG 1140 
GCTGCAGGCA AGCCAAACCC TTGACAGGCA CTGCATCTCC GCTGACTCTA GAAGACCAAG 1200 
CCCAATTTCT TCCCTGTATA TAAGGGGAAG TCTCTATGCT TGGGGTAGAG GA6TGTTTAG 1260 
CTCCTTCCCT TACTCTACCT TGCTCCTACT TTTCTCTAAG TCAACATCGA ATTTGCCTCC 1320 
TTCATTGACA AGGTGAGTTT CTCTCTCATT GCACTGGTAG GGCTGCCGCT GGTCCACTTG 1380 
GGATTGGTGC AGTCAAAACA CAT6TAGGTT TGAACCTCAA GTTTCCATGT TTACATGATT 1440 
AAAAGGATGT TTTGTGGAAT GGTCTCCTAG GAGATATGTT AGATGTATGC TTGTGAATGG 1500 
TGTTAATGAC TCTCTCTTTG ACAAAGGGTT CGTGGTCGAC CTAAAGGTGG GTCAGTGTGA 1560 
CATTAACATT TAAGTGCTTT TTATTCAGCT CTTGAGCGGA AXTGGGACTC ATATCTGTTG 1620 
AATGAAGATA ATAGAAATGG G6CTAACTGA ACTTTCCAGG GTGCAAGTGA GAACCCTGGA 1680 
AA6GTCTTCC TAACCATAGA AAGGGAGTTG AGTGTGAACA TAGTATAGAG TGTTATTGTA 1740 
GCAGAAAACA TGTGGTCAGT CAGTGCCAAA CATCTTTTGC TGTCAGAGGG GAGCTCTGCC 1800 
TTCTAATAAT TTTACATTGG TACTGGATGA GGCTAGAGTT TTTTTATACT AATATCTCCA 1860 
AAAATCA6CT CTAAAAAACT CAGATAAACC A TT TT TT TAA TTTTTTGCTT AATCATTAAT 1920 
AGTGCCAATC CAAGGTTATC CACAACAAAT TTCAAATCCA ATTTTGAATT TTCCTGATAT 1980 
ACTTTTGAAA TGTGTGTGTG TCCTGGGGAT GCAAACCAGT TTTTATGGTA ATATACCTAA 2040 
CAAAATTTTG GAAGGCAAAT CTCTTAAATA CCATGCACCT ATTTCAAAAC ATAATTGCAA 2100 
TAATTCTGTA TGCGCTTTGC TATTGGTATT TGTTTAGTTA CTCCCTTCCA AGCCCTCTCT 2160 
GAATTAACAA GTTGGGTTTT ATTATGCAGA TGATATTAAC TTGATCATCT TCTTCCTATT 2220 
TCTCTGTCAT 66TCAGAAGA TAGGAATTGA GGTTCTTTTC CAAATGAGGC ACAGTTCTCC 2280 
ATGGCTATGA 6ACTCCATTT ATGCATCAGG AGTAAA6GGG TCTTGTGTTT TTAGGTGAGG 2340 
TTCCTGGAGC AGGATCCCGG GTACCX3CGGC CGCATCGATT CGATAAGAGA TGCCCTCTGT 2400 
TTCATTAGCT CTAGTTCTCC CCCAGCATCA CTAACAAATA TGCTTGGCAA GACCGAGGTC 2460 
GATTTGTCCC AGCCTTACCG GAGAAAAGAG CTATGGTTAG TTACACTAGC TCATCCTATT 2520 
CCCCCAGCTC TTTCTTTTCT 6CTGTTTCCC AATGAA6TTT TCAGATCAGT GGCAATCTCA 2580 
GTCCCCTGGC TATGACCCTG CTTTGTTCTT TCCCTGAGAA ACAGTTCAGC AGTGACCACC 2640 
ACCCACATGA CATTTGAAAG CACCTCCTTA AGCCAGCCAG AGTAGGACCA GTtAGACCCA 2700 
GGGTGTGGAC AGCTCCTTAG CATCTTAtCT CTGTGCTGTT TTGGTTTTGT ACATAAGGTG 2760 
TAAGCAAGTT GTTTTTCTTT TGTGGAGAGG TCTTAAACTC CCCATTTCCT TGTTTTGCTG 2820 
CAATAAACTG CATTTGAAAT TCTCCATGTC TCOATCGCCC TTGTTTACGG CACTGTCTAA 2880 
CCTGGATGGG TGTTTTGTGA GGTAAAAGAA GACACTAGAG CCACATGGCA TATGGGAAAG 2940 
TCATGCACAC AAACATGAGA AAAATGCAGA GGCCAACCAG GCAACATTTC ACCAGACXGG 3000 
AATCACAGAG AGAGCAACSCA CTTTCCCAGA TGGXGGGGAT GTCATG6A6A AATGGAGAGA 3060 
CCGGGT6ACA GGTTTTGTTC ATTT6A6AAG GCTTTCTTGA AAAGGGCA6T GAGCAAGCAG 3120 
GTTGGGAGGA AGA6GTGT66 CAITTGAGAAG AAGGGAAAGT ATTGCATGAA AAAGTAATTC 3180 
TTCAC6TGGA ACAGCCAGTA AGGA6GGGCA TGAGTAATAT AG6GTCAGCA GTTACTGGAG 3240 
CCA6AATACA GACTTTGGCC TGGGGA6TTC AAGAACTAAG AGTGGTAATA GAGAGTTGGA 3300 
TATTCCATTT CCCTTCTCXT TTTGTCCCAC CACCCAAAGC TCTGCATAM CTAAGAAGtT 3360 
CCCTTGTTGA CACATAGCTC ATACTXGTGA AGTTGTACAA CAGGATA6CA TAGTGGCCAG 3420 
AAGCATGGAC AGTT6AACTC AGATATGCTT GGGTTTGAAT CTTACCATCA CCATTTACTA 3480 
GTTCTGTAAT ACAGTGCAA6 TTACAGACAT CTCTGCACCT CAGTTTTA6T ATGTCTAAAT 3540 
ITGGGGATGAT AATCCCTXCC TT6TGGGGAT AGT6TGAGGA TTGAArtAAGA TGAA7ACACA 3600 
TGGCTGA6CA CACAGCAAGC ACTAAATAAG TGCCAGTTTT AATGATAACG GTGATGATGA 3660 
TGAT6ATGAT GATGATGACG TAACATTGCT TGTGGGACTC CATACAGCTC AGTAGATGCT 3720 
TGCTCAAAGA AGCAAGTTAC CAAAATTTTT GTAATGGTTC TATGAACGTG AAAAAAGCAG 3780 
TCAACTTCTC TCAGGATCAA TTZCCTTAGT TTCCAATTAG GAAAAGTCtT CTTAGCTCCA 3840 
GAGTGCCACA 6GGCTAATGG AATAAGGAGA GGATAGATCA CACATGTATT ATGCAAACAC 3900 
AACTCAGGTG AGCTCTATTC TTCCTTCTCA 6TTATCCCTT CTGTA66GAC CCCAGTGTCC 3960 
CCTGCTGTCT TTCTGTGTCC T6ACCGGGAA ACACAGTGTG CCTTGTCTAC TCCATCACTT 4020 
6GCCAGCTGC ATGCTTTCCT TT6CAGGCTT GAAGCAAA6C TGGGTCTCGG ACATTCTCAG 4080 
GCACTGACAA AGCTCTTTAG rTGTTGCTGG GAAACACTGG GAAATAGCCC TTTTGTTAAA 4140 
CACACAGAAA CTAGCCTTCG CCCTGA6CCA AATTCCTTAA ACTCGTCTAT GAAATTCCAT 4200 
AACCTGACTC CtTAACTGCA GACAtACCCA GCTAGAACAT CCCTCATGTC CCTGTCCACC 4260 
GTGAGAATGC XGCACTTCAC TCTGAACCTT TAGTCCTCCT TTTAAATACT GCACACTGAT 4320 
CACCCTG6TG TTTAGTGCTT TGTTTTTTGG AATCCCACCT GGCTCCATTT TGGGATGGTT 4380 
CCGGGCACTT CCCTATGGAA ATTCCCCTGC TGTCACTGTC AGAGTGAGTC CAGCAGTGGG 4440 
TTTAGCIGGA T6AAACACCA CCATGTCCAT TTCCA7TCAG ACTAAT6TCA GAATTTGAAA 4500 
GGCACTATGG TAGAGTAGAA AGAACAAGGA ACTGTACTAT TTAAAGGGCA GGCAAAGAAA 4560 
AGGCATCTA!r AGCTTATAAG ATGTGT6GAT CZTTGGATG7 GACTTGGCCA TCCTGAGCCT 4620 
AAGTTGTCTT GTAGGAGAAA TGGGAATGAG AATATTTTCC TCTAGACATC AAGAGGAAAA 4680 
GAAATAZAAC GTGAAAACCT TTGTGAATTG TGAATGTGTT ATACAGAGTA GCTAAAAGAA 4740 
TTfiAAAAGG& AGTGAGAAAA AAGTAAAAGG CAGCTGGCTG CTCAGGGCCT CCATGGA6GO 4800 
AAGTACCTTG ATATGGTCAC TGTGGCTCAG TGACAGCTCT 6CAGGGACAG GAAATTGATT 4860 
TGTXA6IGCA CCCAAAGTTG AAXCTGCTCC TOAGTACtTGA TXTATGGGAA CCAAACACAq 4920 
AA6A6AT6AA GGATG7GTCA ACCAGAAT6T CCAGCAnAG CTTGTGGGGA AACACATACT 4980 
TCCAGTGACT GAAATACCAT CCTGTTATCA AGAGATCT6G GAAACTAAAG TACTGACAAG 5040 
AGCTGGCTTG ATCTGTGGAT TTAGAACAAT GAGAGTTAGG TGGCCTTGAG GGAGATGATT 5100 
CACTCTCCTT CACAGAAGAG CTGACCTCTG GGGTCAACAG ATATAGCACC TCTTTCCCAG 5160 
GGACGCTACT GAATGAACAG TQATGTGTTC TTATACTCTG GCCCAGATTT TCTACATACT 5220 
TTCTTAGGTT ACAACTTTAT TTAGTCACAT TTCAGTACJTG GGGATACTCC TGTTTATCTX 5280 
CXTTGGACTC GAGTTTTTAT GGGAAGGTCA TGAAACAGAG AAAAATACAA TTTGCAGGGA 5340 
AACTTACCAA GGCTTGtAAG GTTACAAGGA TTAAATGAAA ACCCTGTGTA AGTCAGTATA 5400 

TAGTGAAGAA GTAAATTGA6 TTAOACCAAA CGCCAAAATG CATCCGCATT AGAAAGAC6A 5460 

TAAA6GAAGA CTCTGGATTC AGTTCTGTTC AAAAAACATT TTCXGCACAA ATACTATGTA 5520 

TGAG6AACTG GGC6TT6GGG AGATGATGAT GAGTGAGACA TGGTTCTTGC TTTCAGAGAG 5580 

CCTAGAGACC TGGGTGGTAG CAATGGTAGA GATACATCCA AGACACAGAA ATAGATATAC 5640 

AGGAACACAG ATGAtTGAAA GTGATGCTTG GCAGGGCTTT AAAGAATGAA TCAGAGTTTT 5700 

TCAGGCAGAC GAGGATCTTC AAGGCAGAGG GAATCATATA GATAAGGACA TAGAAGAGTG 5760 

AAATTTCATG AAGTAGTTAA GCATCTGAAG AAGCATGGAA TTAGTGACAA GAAATGATGC 5820 

GGAAAAGATA TCCAGATCCA ATCAAGAAGG GCCTTGTTGG CATTCTATGG AGTCTGGACT 5880 

TTGGCTTCTG GGTCACAAGT TCTCAGATGG GGTTTTCATA TCTATTATTA GACCTACTAT 5940 

GTACTGGTCC AGTGGAAGGG AAAGGGGTTG TCTTACTGCT AGTGGAGTAG GAATTGGGTA 6000 

TGGACCACAG CTTGTCTTGT TTCCAAGTAT TCCCTAAGAA ATCTGGTCTG CTGATGGGAG 6060 

ATCTATTTAT GGAAATGTCT TTTTCCCTCA GGAATTTTAT GTCGGAAACA GCTGTCATAG 6120 

GTGAGGAGGA ACTGGTAAAA GTACTTAATA GGAGAGTGTC ATGGTCAGAT TGGTGTTTTG 6180 

GAAAAGTCAG CCAGGGCAGA TTGGAGAGGT CCATATTGGA GGCAGGAAGA CTTAAGAGAC 6240 

TATTGCAAA6 GTGAA6ACAA AAGACGATAG GGACTTGCAC TTTAATTCCA GCCCTTAGAA 6300 

GTAGTAGAAG GTCAGAAATG AGAATATGCA TTACAGA6AT AGTTAGTTGC TATATCATXA 6360 

GGACTTGGTG ATAGATTGGA TGAGGATGCG GTTGGGTGAG GCAAAGAGGA GAGTCCACAT 6420 

TCCTGGTCTG GGTAGTAACA AAGAATCTAG CAAGA6GGCT TGTGGGGAAA GATGCTGAGT 6480 

TAC6TAGCAA 6TGCATCTGC TTTATCCTTG TAATGAATGG GGCTAAAGGT GTAAACCAAA 6540 

GAGTCATCAG CATTTGGAGG GTAGAATAAA TCATCAGATA ACTCAGGAAG AAGGAGCAGA 6600 

AGAAXTACTG ATACTCCCTG GAAGGAAAAC CGGAAGTAAA TGGGAGAAAC TTGCTCAAGT 6660 

GGACAAAGXT XAACAGACAX GAA6CAXGAA XXC 6693 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24979 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GAATTCCAAG CTTTCTGCT& TAA66AGGGA CCTCAG6GAG CCAAGGTCAG CCTGCAGCCT 60 
TTCT6TGCTC CTTTGCCTCG CCTGACA66T ATGAGGATGA AATCAACAA6 AGGACTGGCA 120 

GCGAGAATGA CTTTGTCGTC CTGAAGAA6G TGAGGGAAAG GGGAGTCCTG AGGGTGGCTG 180 

TGGACCCAGG AGGCTGA6GG 6AGTGAGGAA TCCCTAXGGA TGCTCTG7GA CAATGGCAQG 240 

GTGGCCTCTA CGGCCGGCtT GCTGTGTAtG ATGCCTGAAT 6C6GGGCCCT TACATT6GAA 300 

CT6ACACTGA TAATGACTCT TCAGGAAGCC TTGAGTTCGT ATCTCTCTGO GGTCTGAAAG 360 

TGAAATGAAG TGAAATGACA GCTXITGAGX GTCAQTTACC TGTAGCCTTG GGACCTAAGG 420 

AAGGACCTGG QGTGTTGGTT GTGACTGACT GGGATGTGGA GGTTGGTGTC ACATCTCCTT 480 

CT6GCCAG6A AAGCGAGGAC TTGTGGGTCC TTATTCGAGT GCGGTGATGA ATTTTTTAAG 540 

TAAGGAAATA AACCTAGAGT GGCTCTGGTC CTGAGCCAGC CAGTGAGCTG TGGCAGGCAA 600 

TGCCTGGGCA ATAAAGTCAA ACTGTTCTGC CTGCTATTCA GCATGTGGAT GCTGCTTATG 660 

TGAGCAAAGT GGACCTGGAG TCCAGGGTGG ACACTCTGAC TGGGGAGGTC AATTTCTTGA 720 

AATATTTATT TTTGACGGOTG AGTTAAGCCT TTATAAGAAC CTCCTTTCTT TTCTCACATC 780 

TCACAAGGAG TAT6GGCTGX AAGAGGGGAG GCCT6AAACC CAACACTACC CACTAGGGAC 840 

TCATCTCCCC AGGTACGCCA ACTCTGTGGG CGTGGAGTCA GCCATCCTCT GCACCCCAAT 900 

GCTCAGAATC CCCAGGTCG& GGTAATGAAG ATGGAAOGCT GGGAGAATCC TGAGTTAGGT 960 

GGAG6CGAAT GTGTCCCT6G RCATGGCTT CCAATCTGTC TGGGAIUITCA CCCAGACATA 1020 

TAAGGGGCAA AACCAACCAG AAATCTTCAT 7AATTCTGGG GAGTTGATGG AGCXGTTAGG 1080 

AACTCTGTGG GAGGXGACAG XGXGAGXCXC AA6GAGXGGA CXGACCXXAG XGAXGGGGGA 1140 

XCAAACACXC CACCACCCGG CCCXCXXXXG CCXGXGTCXA ACXXGGGGGX ACGXGCXCXG 1200 

GGCCAGAXGC XGXGXXAOAA GXXXAXGXXA XG6GXAXCXC CAXXCXACAG AXGGGAAAAC 1260 

XGAGGCACXG AGGGGXXAAA XXACXXGCXX CAXXACCXAG CXAGXCAAXG GXACAGCCAA 1320 

GACXCAAAAG XGA6XCCAAG XGACXCCXXA ACXAGAGXCC AXCXACXGCC XCGGAGXACX 1380 

CAXGXGGXXX CAA6GAAGAG GCAXGCCXGC GAAGGAGCCC AGCXCACXAX GGXGGCCAAG 1440 

XCAGAGCAAG GCA6AGXGGC AGCXGCAGGA GAAGXGXGAX GGGGAGAXGG XAXCXGAACG 1500 

CICCAGGXXX AGGCXCCXXC CXXCXCCCCX GGAAGGCAGX XAAGACXCXC CCXAXXAXCX 1560 

CXCAXXGCAC ACAACAAXXC CAAGAGCXXX XCCCAAGACX ACCXGGCCCA GGCXXCXGGC 1620 

XXC^CCCGAG AGCCXXGAGG GAGCAGCAGA GGAAAACXGA GGCCCCCAGA GGAGAAXGGA 1680 

AGGAGXCAGC CXGXGCGCCA XGCCXCGCAG GAGCXGXCTC AGGXGCAGAC XCACAXCAGC 1740 

GACACCAACG XCAXCCXGXX CAXGGACAAX AACCGXXCCC XGGACCXGGA CAGCAXCAXC 1800 

GAXCGAGXGC GGACCCAGXA XGAACXGAXX GCACAGAGGA GCAA6GACGA CGCCGAAAGC 1860 

CTGXACCAGA CCAAGGXGGG CGXGGCCCAG AXCXGGXGCC CAGAAAAACA GAXXCXXCCC 1920 

AGAAXXCXCX XICTCXXAXX GCAXXGXCXX XCXCXXAXXX CXGAAGXAAA AXGXGXXXGX 1980 

XAXACAAATT CXAGAAAXXA CAXGXAAAGA XXACCCAXCX CXCACXACCG CXAXXAAXAX 2040 

GXXAAXAXCX CXXCXACCAG X^CXXXGXCG CXAXXAGGCX AGXGGAAAAG XGAXXGCGGX 2100 

XCXCGCCAXX AAAAGXAAXG ACGAGAACXG CAAXXACXXX XGCAAAAACC CAAXAAXGXX 2160 

XAXXGAGAAC XCAXAXGXGX XAGGCACCXA GCAAAGXGCX XXACXXAXXX AXXATXAXXX 2220 
CATTCAGTCC TTACAACAAC CAATGAGGTA AGAATTTCGT TATCACCATT TTATAATAGA 2280 

TAGTAGTGTG TGACATTACT TAATTTCCCT AATGCCTTGT AGCTAGTAAA TGCAGAGCCA 2340 

GAGCTTAATT AAAATTGGTT TGTGTCtACA AACCCATTCC CCTCACCACT AGAATGATTT 2400 

TTATTCTTTT TTCATAATGG TATCTATTAA AATATATTTT TTTTACTTTT XTTTTTTTTT 2460 

GAGATGGAAA CTCACTCTAT CTCCCAGGCT GGAGTGCAGT GGCCTGATAT CAGTTCACTA 2520 

CAAACTCCGC CTCCTGGGTC CAAGTGATTC TCCTGCCTCA GCCTCCTGAG TAGCTGGGAT 2580 

TACAGGCACO TGCACTACAC CAAGCTAATT TTTGTATTTT TAGTAGTGGC AGGGTTTTGC 2640 

CATGTTGGCC AGGCTATACT TTCTCTTTTA CTTAACACAT ATGGATACTT TTCTGTGACA 2700 

CTAAATAACC TGCCGCATTT TTAACAGCCT GGTATTATAT TGTTAGACTA CACCCTCCCT 2760 

TATTAGATCA ACTCCTTTGT GGCTAAGTTG TTGGGCACAT CTTTGGTTAC TTCTTTTCCA 2820 

TAAACT6ACC tGGATCTTTT TG6ATGGTGA AGCCTCTGGT TGAAAGGGTG TGGCTGACAG 2880 

TCCAGTCACT AAATTCTGAA CAACTAGCAT TGATGACTGG CTTTGAGGAT GATCTGTGGC 2940 

CAACTCCAAT CCTGGCTGAC CTCT6TCCCA CGGTCCTGCA CAGTGTTCTG GGGTGGAATG 3000 

GATTTCGACA TTAGACTAGG AAGCCAGATG GCCAACAGTG AAAAATAGCA GAGTGTACCA 3060 

GATTCCCTTG CAAGTCGATG CTTCTCCTAC CCACTTCAGA GCCCTGTGCC TGGGGGGT6G 3120 

AGTTCTGACT AATGGGGCAA TACAGAGACA GAAACAGAGA TGGAGGGGAA ATGAGACTGA 3180 

ACGTGGAGCC AATGGAGGGC CTCTGAGGAC ATGAGGTCTG CTTGACTGCT AGGGAGATCA 3240 

TCCTGGAAAA GGGTGGGAAG CTATATGGTG GGTGGAAAGA GTGAGGGGGT CTCAGTGTGG 3300 

GTAAGGACCA ACGT6AAGGC TTAGATGTGT GAAAAGGTGG TAGAAGGGCA TCACAAAGCA 3360 

GGTTTGTCTG GCCTGGGATG AGAGTCTGCC CAGAGACTGG TGGGAATGCG 6GAGGCTTGG 3420 

GATAGTGTGA GTGTGTGCAT GTATACATGT GTTTGCAGCC T6GGTGAGGG AGGTTTGGTA 3480 

TAGCTGTGAG TATGCATGTA GGGGTGACCA CAGTGCAAGG TGGGTGGGAA TCTCCCAGGG 3540 

GAGAGCAGCC CAGACCTACT CCTCCTGGAG GGGCTT6TGG TGGGCAGCAC ATGCTGACTA 3600 

tGATGCTCGC TTTGGCCCCA GTACCAGGAG CTCCAGATCA CGGCAGGGAG ACATGGAGAT 3660 

GACCTGAA6A ACAGCAA6AT GGAGATT6CA GAGCTCAACC GCACCGTCCA GAGGCTGCAG 3720 

GCAGAGATCA GCAACGTGAA GAAGCTGGTG GGACGGGTGC TTAGGGAG6G CTGACCAAAG 3780 

CCCTGCACCT CCTACAATGC CCTGCCAGAT CGAGCTCTGG AAACTTAACC ATTAAATGGT 3840 

CTCCAACTGrr CTCTGGAGCA GATTGAACAG ATGCAGTCAC TCAXTTCGGA TGCTGAGGAG 3900 

AGAGGCGAGC AGGCCCTCCA GGATGCGTGG CAGAAGCTGC AGGACCTGGA GGAGGCCCTG 3960 

CAGCAGTCCA AGGAGGAGCT GGCCC6GCTG CTGCGTGACT ACCAGGCCAT GCTGG6GGTC 4020 

AAGCTG7CCC TGGATGTGGA GATCGCCACC TACCGCCAGC TGCTGGAGGG CGAGGAGAGC 4080 

AGGTGGGTCT GGCAGCTGTG TTTCTGGGGC TAAGGCTTGA GATGCACCAT GAAGCTGTGG 4140 

GACTGGCTAT TT6GAGAAAA GATAAGCCCA CCTTTTTGGG AAGATTGGTA GCCAGGTGAG 4200 

CAGAAACATT CCAATTAGAG GCAGAGGCTG TGTGAATGGA CAAGCCTCTT CACACAGGGA 4260 

GAAGTCATTG TTATCATTCC TCCrACCTCCA AGTAGAATGT CCTTATACCC CAATCCAAGC 4320 

CTCTGCAGCT GGTATTCACC CCCAATGCTA AAAGGCTTCA TGAAAACCCT GAAATTTCTC 4380 

TCTGCCCCAC TGGCTTCCTG J^CCTCTGCTC ATGCACACAC ATTTCCCTAA GGCTTGGGGA 4440 

CACCTCTGAT CCAGATGTCT GTGGCCACA6 CCTTCTCTCC TCAGGCCCTG TGGGTCTGGC 4500 

TGACCGTGTG CTTTGGTTTT ACAGGATGTC AGGAGAGCTG CAGAGCCATG TGAGCATCTG 4560 
TAAGTA6CA6 ACCGAGGGGC AGAGA6AG6C TGGTGGTGCT GGGT6GAGGG AGGGCCAGGA 4620 

GGTGGCCAGC AGA6AACGGA AAGTCTGGCA TTTTAGCTTC CAGTdCTGTG CAATAGACAC 4680 

CAAAGTAA6C AAGTGTAATG CAAAGCCTGG AAGAATTCAT TTCAAATAAA TGGTTATGAT 4740 

TTCAGGTCTG CTTATCXTAA TTGTTATGAT GCCTTTTTAT TATkATGATGC CTAGGA6GAA 4800 

TCAGCAGCG6 CTAGAACTCT XTAGGGXACA TATTCAATAA ACAATGTAAG TGTGTTGCTG 4860 

A6AGGAACC& TGGCATCCCT XTGTAGTATG AAGAATACTT TTCAAGTAGG AACACXT7CA 4920 

AITTTCAATG TATCGGGTTT GCAAGTCGAT GCCACGGGXG AXCGAGGAXG GAGGAGGCXG 4980 

CAGGXGCAGG GCGGGXGCAG GXGGGGGCCX XGCCXGXCCC XCXGACCCCG XGXGCACXGT 5040 

CCCXACCCAC AGCCOXGCAG AACA6CCAGG XGAGCCGXCA ACGGCGGCGC GGGAGGCGGC 5100 

GGCAGCXACG 6CXCAGGAGG CXACGGCGGC GGCAGCGGXG GGGGCXAXGG CGGCGGAA6A 5160 

AGCXACCGCG GAGGCGGGGC ACGAGGCGGG AGXGGAGGCG GXXAXGGCAG CGGCTGCGGC 5220 

GGCGGXGGCG GGAGCXACGG A6GGAGCGGC AGAAGCGGCC GCGGAXCCXC GCGCGXGCAG 5280 

AXCATCCAGA CCXCCACCAA CACCXCCX^C AGGOGGAXCX XGGAGXAGAG GCCXCGIXTC 5340 

XGCCACACAT CACGCCXGCC GCXCACCGAC CXCXCCICAA ACICCXCCCC TCCACGCCCT 5400 

XCCXAAXCCC CXCXCAXXCA CXXXXCXXAA TGGGTCTCAG CAAXXXXGCC AAXAAAXXC6 5460 

ACXCTAAXGG GGGAAGCAG& GXGGATAAGX CCAAACAGCA GAXCXCXCXT XTGGAGGGCA 5520 

CXGGCXXGCA GXCAGAXXCA CAGCXAGGCft CAXXCXCACT CAGACCCCGC TCXGCIGGCC 5580 

CXGCTGCTG^ XCCXGCXCCC ACCXXXXTGG AAGAXCGGXA GCCCAGGGXG AGCACAAACA 5640 

XXCCAGXXAG AGGGAGAGGC XGCGXGAGXX GGCAAGGXAG GGAGAAGTCA TTGXXAXCAT 5700 

XCCICXGCCX CCAAGXAGAA IGCCCTXAXG CCCCAGXXCA AGCCACXGCA GCXAAGXAXI 5760 

AACCCCCAGT GCXAAAAGAT ACGA6GCAXC XAGXXXAGCA AXGGAGGGAA ACAGAAACAG 5820 

CCXGTAGAGA GCAXCGACAA GGCGGAXAAX GGAGAGXX6G TAXCTCAACC CCAAGCXCCC 5880 

XXXGCXGXAC CXGGGCCXGC XCXGXGAACA AGAAXCCACG CCCCCCXGCC CXGCXGGGAC 5940 

CACATAAXGA TCCCXXX6G6 GAAGXTGCXG AXXGCAGGGC AAGCXAGCXT GGXAAGGAAA 6000 

ACCXCXGCAC CAGCGGCCTA IXCCXGCCXC XXGGTCCAXA GCCXCAXACA CXCAXGXXGA 6060 

XGGATAGXAT AGAXIXGCIG CCCACACCAG AXAXCTGXAA GGCAXCACXG TCCIGATXCT 6120 

GAACCXCXGX TXCAGGAAGC AXXCXCCCCX GXGXAAACAA CXCAAGGXGG AAGXAXXXCA 6180 

GAGGGCAXAG GGXCAXGAAT CCXXACCCAA AGGAAGCCXG XXXXAGCAGX G6AXGCAGGA 6240 

G^GAXGAAC AGACAAGCAA GXXCXGCXXC XOXCCXGXXX CCXCCXGACA GCXCCAXXCX 6300 

TXXGAAGCCX QACCCXXCCX AA6CTCXGCA XCAXAACGAC XCXGAGAAXX GKSCCCAXXGG 6360 

XGGGCAXGXG AAGCCAGCXC XGXXCCAXCC AGGXGCCXCG GGCCXGAGGA GXCXGAGGAT 6420 

CXGACXXGGG XCXXGGAAGG GXXCCAACCC AAGXCAGXCA GGAAGCXGCC CAXXXXXXXG 6480 

CAAGGCAXXX XAAXGCCXXT CCCAGACCXC XCXAGXCCCX CCXGCCXXCT GXXCXCXCGA 6540 

CAGCXGXGAG CCCXXXAGAG A6AAXAXGAC XCXXAAXXXX GAAXCXXAXG XAAGAGGCXX 6600 

GAGATGXGCX GGAGAGGGCA GGAAGAGGAA AGXAXCA6GC CXXGAGAGAG GGAAXGXAGC 6660 

XXXGCXXCXA XGAXCXGGAG XCACCXXCAC XIGCXAGCXG AGICCXAACA CAACXXCCAA 6720 

GXCCaVXGAXX CXCXXGGGGC AXJTGGAXGGG CXCAGXGXGG GXCXCXXAGG CXGXXCXXGX 6780 

GACXTCAXCA XXICCXGGXX CAAAGXXGXA CIQXCAAGGG GCAGCAXXXC XGGXAXXXCX 6840 

AXAATAAAXX XXCXGXGAXC XCAAAXXGCX GXXXGGXCAG GAGAXGCAXX AXXXCIXCXX 6900 
CTTCTTCTCC TTCTTCTCCT TTGCCTTCCT CCTCTTCTCC CTCCTCTGCT TCTCCTTGCT 6960 

CCTCTGGCTC CTCCTCCX30G CTCCTCCTCC TCCTCCTCCG CGCTCCTCCT CCGCGCGCTC 7020 

CTCCTCCTTC TTCTTCTTCT CCTTCTCCTT CTCATTCTTC CATCTTCATC TTCATCTTCT 7080 

CCTCTTTTTC TTCTTTCTTC TAAATAAAGA TGGGGTCTCA CCATGTTTCC AAGGCTGGTC 7140 

TTGAACTCCT GGGCTCAGGT GATCCTCCCA CCTTG6CTTC CCACATTGCT GGGATTACAG 7200 

GTGT6AGGTG TGGTACCTGG GCTATTTCTT TAAAAATTXC TGCAGACCTC TGAAATTATT 7260 

TATATTTGGG AA6TTAAAAT TTCTTCTTAT TTTTTATTGT ACAAGTAATA CACAGTCTTG 7320 

AAGAATCTTA CAGACATAAT CTTATTAATC CTTAAAGTGG CTGATCATCC AAAAGTCAAT 7380 

TATACATTTG TTCAATGAGC ACTTATTAAG CTCCTACTGT GTGGCGGGCA GTGGCTTAGG 7440 

CACTGGGAAT GCAATGTTGA ATGAACATGT TTCTGACTCT TAAGTTGCTC ACAACTAAAT 7500 

GACATATTAT GGGGGAGGGA CGATTCAAGG AGAGAAGAGA AATCTGAGTG TGCTTCTAAG 7560 

GACCTCTAGC CTGAGAGTGG AAGCAAGGCC TATCCTGAGG ACACA6GCAG ACCCCCCAAA 7620 

ACAGGAACAG GTGGGACTTA CGACAGGTGC CAGTGCTGGG GAAGGGAC6T TTGGTTCCAA 7680 

CAGACTCCTG GAGGACT6GG ATATGGAACA GGGCCAAGGA AGAGAGGT6T GGGTGGGGAG 7740 

ATGAGGGAAG GGCCCTCCAA ACAGGGGGAT AGTCTGCTCA GAGACTCAAA ATAAGAGAGA 7800 

GTGTGGGGG7 GAGAAGGAGC AGCTGGACAG GAGAAACTGA GCTAGGGAAG GAAGGGGCTG 7860 

AGGCCACAAA CTGAGTGGGG TCATGGGCAG A6ACATCTTC AATTGATGCC TTGA6GGAAG 7920 

CAGAGATGCA GAAATTCCAT AAT6GAGCAA GTTAAGCCAT CACCTCATCC TATGTGGTAG 7980 

TTCTCAGTCC ATGTAAAA6A ATCACATAAA AGATGTGATC TACTTTCTAA TTCCCTGGAG 8040 

GACTTTGCAT GCAAAT7TGG ATAT6GGATT CATTCGAATA TGACAGGAAC CCCATATTGA 8100 

TAAGACACTG TTGCTCCCGG GTGGGCATTG TTCAACTCAA GACTTGAT6A CCCAGATAGG 8160 

TGTGTCTTTG CAGTTAGCTG TCACATGTCC CACCGTTGAA AGGTGGGCTT CTCCTCCACA 8220 

TGTGCAGGGC TCTCTGCCTG CCTTTCCCTT TTCTCGTGTC CTCTGACAGC CTGCTGCCAG 8280 

GATAGAT6AG ATGGGGAGAA ACTTCTCA6A GAGAATAGAG GGGTGTGCAT GGAAACAGAG 8340 

TGTCTTATCA CTATGGGTTG ATATGATGTT TGCAGTTAGC TGCCACATCC TCCCCAAAGA 8400 

CTTCTGGAGG GCATGCCTGG GAACACAATG TTTTATTCAT ATGGGTTGCT GTCCTATTCC 8460 

AATGAATCCC ATATCCAAAT TCCATCAATA TCGCCTTCA6 GAAGCTACAA CATATTCGGC 8520 

TCAATATAAG AAGCACCTTT CTATGATCCT GACATGGGAG AGGCTACCCT GGGGAGTGAT 8580 

CAAGTTTCAA GTCAGAGATT GGCTAACCGT TTGGCAGGAA CGTTGAGGGC GGGAGTGGAG 8640 

ATGGGTGGGG ATATGGTATG GAGGCATCTC ACTACTTTGC 7GTACTAAGA GTTCACATGG 8700 

CGAAACCTGA GAAAAAAAAT TCTACTCTCT GTGTTATATG GGAAGAATAA GGTCAGGTGC 8760 

CAGTGAAAGC TAAAGTCACA AAGAAGCCAA A6GCCCTAGC CAGAACTGTT AAATGAGGCT 8820 

AAGTTTTCTG GCAGCACAGG GTCTATTACA GGGTGTGAGT TTGATTATCC CTGGGATCAT 8880 

GCATGTGTGA TACTCTAATG GGATCCACGT TGGCTCTGAG AAAACACGCA AGGATAAGGC 8940 

CAACCACAGC TCTCCTTTCC CATCCTCTCT TGGGAACAAG TTGAGATTGT CCCAGAAAAT 9000 

GTGGCCCTGA CTTATCTCTT COGAATTCCT TGATTTTGTC CTGTCATGGA GGCCTGGGGG 9060 

ACAGATGGAG GGAATCATGT G^CTGAATCT GAA6AATATT GGAATAGAGA TTCCACAAGG 9120 

TAGGGGCAGG AGAAATAAAG GACAGAAAGG AGAGGAGTTG GTCAAAGAAG GCATCTCAAC 9180 

GTCTAAATGA GAAGTCTTAA TTCGATGTTC AGGGAAAGAA AGAGTAACTT TAGGGACCTA 9240 
AACAAGGAGG ACTAGCACTA AGACACTGAA GAGATTTCCT GAAATAGACA ATATTTCCAT 9300 
CAGAGACAAT GA6AAATCCC ATCAGGAGAA AATGXCTCTC ACTTTCAGCT CACCCCAGTG 9360 
AAAACAACAA GCATTCTATA AACCATGTAG GAAATGCCCA CACATGCATT ATCTCACCTG 9420 
AGTCCCACTC ACC7GGGAGT GC6GAGACCA GGTGTGGGGG TCTGCAGTCC TTCTAGGGAC 9480 
CATGGAGTGC TCCATCCCTG CCCCTAATCA ATGCTATTCC CACAAGGCA6 ATACTCAGAG 9540 
GGAGAGCCAA GCAGGCTCAT TGCAGTGCAA TAAAGCCAAG AGGCIGGCAG GAGGGAGCAA 9600 
ACACCCGGGT TGGXGA6AGT CCCAGGGAAA GTCTGCCAGT CTGCTCTTTG CTCTGAGAGG 9660 
CAGGGTGGCA GGGTTGGGGC ACTCTGGAAA TATAAAXTTA GTTCCACGAG CTTCTCATCC 9720 
ACAGAGATTT TGATCTGAGG ACATGGTTAA CTGOAGGAGC AATCATTGAC TCAGTAAAAT 9780 
TCtAACTGGA TCTGACCTTA GACA^GGTGt GCGTTTCTQG GCTGGGAAAG TTCCTGGTCT 9840 
GAGGAAGAGT CTCTTGAGAA TGTCATCTCT TTTCAATTAC CCAGCCTTTT GGCCCAGAAT 9900 
GCATCTTCAA ATTAATGAGC CaiTXTGCTG<? TTAATTTGGT CCCAGGGAAA AAAGTCCAGC 9960 
AAATTACTGG GCATTACACT GAGCTTSAAG GTCCCTCTTC AAGGTTGCCC TGGTTTTATC 10020 
AGCTTTCCGA TCAGTCTGGG AAATGGGATG TCTTCAAGGC TGATCAATGC XCTGTTGAAG 10080 
GGCTGGCTGG GAATTTGGGG TATTGGGAGG TTTTCTAGCA TGQAGTACGG CTCCGAGTGG 10140 
CCCAATCCCA AGCCTGGAAG GGCZTCCAGG GGGCTCTAAG TGTGCATTCT GACCTCCACA 10200 
CCTGCCCCTG l!GTGCTCAGC CCTCA6TGTT TGTGClCCCC CTGCAGAGCA GCTCTGCAGT 10260 
GA6GGCAAAG GCTCCTCGCA TCTGGCCCCA GCTCCCTCCA GCCTCAGGTG AGCCCGGTGA 10320 
TGCACCXGTG ATCTCTTCTT CCATGTGAiTG CCCCCTAGCT TTCCCAGGGC AAGTCCGTGG 10380 
ACXTCT7AAG GCZTTCTCTC ACAAGATGAG GAAATGGGCC CATGTCAAGG GCT7AAATGT 10440 
CCTGTTCCAG CCTTTTCACT GTTTCCAGTA AATCAGGGGC TTGTTCTAAA GTTTGTTtTT 10500 
TTXITTCTGG TTATTATATC AGCTTCTGGG TTCTCTCAAA TGGAAGAGTG AGGGAAAATC 10560 
TTCCTTTTTT CCTTTTTTGA GATGGAGTCT TCAGCATCAG TAGCCCAGGC TGGAGTTCAG 10620 
TGGCGAGATC TCGGCTCACT GCAAGCTCCG TCTCCTGGGT TCACGCCATr CTCCTGCCTC 10680 
AGCCTCCC6A GTAGCTGGGA CTACAGGGCC TGCCACCAGA CTGGGCTAAT TTTTTTTGTA 10740 
TTTTTAGTGG AGACGGGGTT TCACCTTGTT AGCCAGGATG GTCTCGATCT CCTGGTCTCG 10800 
TGATCCGCCC ACCTCTACCT CCCAAAGTGC TGGGATTAAC GTGAGCCACC ACGCCT6GCT 10860 
CCTTTTTTCT TTATCTATAC TCirACTATGC TTCAGTTTCC CTGGAAGGTA CATAGAGCCT 10920 
CCT7TTACAG AGAGAACTAG CTCAGAGA6G TCAGTGACCT GCCTAGAGCA GT6CAGAATC 10980 
AGGAGCGGAG CCCAGCCTGG CAGCCTCCAT GGCACAGAGC AAQATGGGCC CCACCGCCTC 11040 
TCTCCTCCAT GTTCATCTTT GGTTTCCTCC TTCCTGGCCT CTGCTCTGCT CCAGCCTT6C 11100 
TAGTGAGTGA CTCCTGAGGA CCTCCTTCTT TGCTGTCCAT CCTAAATAGG GCTGCCCCTC 11160 
TGTCTGCAGC TCTCCCTCCT 6CATAAGAAG CCTTGCGCCC TCCTCTGCTG CCd!GGCTGCT 11220 
TTCAACATCT CGCCCCGCCT CCCCATTGTC TGTGATTTCT CTTCACTCCA CCGAGGCCTC 11280 
AATTTTCCTC AClCCCCTGG GATTTCCCTG TCCCATGTCC CTGGTGGAGT CCCTCAGGGT 11340 
GGGTGGTTG7 CATGGAGTGC TTTCTTCACT CTTTTCTTGG TCCCATCCCA CAAAAGCTCT 11400 
CAAAACATCA CCACACCTGC TQCTGCCCAT GCCCCACAGC CACCCCTGGC AGCCTCATCT 11460 
CAATGATCAG TTCT6GGTTG TGlGTGTGAG TCCTTGGGTG GGGGTGTTTT GGTGCTCTGT 11520 
CATCAGCACC GCTGGGGTAA CTCTCAAGTA TAAGGGGCCA TGTGGGATGC TGGGAGGGCA 11580 
TCAAAAGACA CA6666ACTT A6TCTTGCTT TCCAAAGGCT TCCA6AGTGA TTGAGGGGCC 11640 
AGGAAACACA CAAGCACATG CATGAAAATG AGCCAACAAA TGCATCAATA TGTACTAAGT 11700 
CTGGCAGCAG CCGAGCTTGG AAAAAGAGAC CAATAGAGCA CTTGCCCGAT 6TG6ACTGAG 11760 
CAAAACTCCC TGGAGGA6AT GAGATCTGGA CCTGTCTGCT GCCTGCTTTG AGTGAGAGAT 11820 
GAAGGCATTT GCCCACAAGC CCTGATGGAC CAAAAACAGA TTCAGGACCA AATGCTCAGC 11880 
CATTGAGATC TTTGGTGCCC GAGAGCTTGA CTATGGGTAG G6ATTT6TG6 CAATGCCGAG 11940 
GCAACCAGAA GACCTTTCA6 AAAAOAGAAG AGTAGAAGTG GGCTTGGAAG ACAGAGAGGA 12000 
ACAGGGATGG AAAGGGAAGA AGAGGGTGAT CAGTCTTGGG CAAAGCACGA GAGCXGAAQG 12060 
GGTCAAGGCT GXGAGGCCGG GGAAGTGGGT GAGCAGGGTA AGATGTAGGT GGTGCTGGTG 12120 
GTGAGAGCAG GCCAATGACA GAAGGAGCCC ATGTGATGCG GCGGGCTTGG ACTCTGGAGT 12180 
GAGGCACGTG GCTTGTCAGT TACCTGCTGG ATGACTTTGG GCACATATTT CAACCTCTAT 12240 
AAACCTAAGA T6CCTTTTCT TTAAAATGGG 6CTAAGAGCT CCCACGACGC AGACTTTGTG 12300 
TGGTATTTAA ATGCAATGTG GCTCCTAACA GCATAGTTGC TGCGTGTA6A TGTTAGTGTC 12360 
TCTTTCTTTC TCATTTTGTC TTTATTTCAT AAATGCACAG TCACTAAGTA AGAAAGGAGA 12420 
GAGTGTGTGG CTCACACTTT CCTGCATGTG GTTCTTCATA TCCCACACAC CACACTGATC 12480 
CTGGGGACAT CACAGGAGAT GACGGGCCTG GTCTGGCAGC ACTGCAGCTC CA6CTCTGTT 12540 
GGGCT6CCTC GAAAGTGGGC AGTGGAAAAA GAAAAGGAGT TTGATTCAAC AATTGGAAGA 12600 
GTCTCAGGAA TTGACTTATG ACTTGGACAC txTTTXTTTT TTTTTTITXT XTTTTTTTTT 12660 
TACTTTTTTG 6GCCTGTGCT CTCACTTCTC TGTGAGGCAG GTTA6ATGAT GTGACCTTTG 12720 
AGGCCCCATG GATGAGAACA TTCTGTAATT CTCT6TGTAC TTGTTTATAG GGCCCAGXTC 12780 
CACTTGCCT6 TCTTTGAGCC TCTTCCCGGT TCAGGGAGGA ATGTCACTTG AATTGAAATC 12840 
A6AAAACCCA GATTCTGCTT CCAGATGTGT CXTTTCCTA6 CCGGA6XGXC XAGAGGAAGC 12900 
CACXXAAXCX CXGAGAAXCA GXXXXCXGXX XCAXGAAAXG GGXXGAGAAC AGCXX6AXX6 12960 
CCXAGXXCXC AGGGCXCXXG XGGGAXGCXC XXXGCAXAXG XGXXXGGXGG GGXGAGCXGX 13020 
GCAAAXGXAA GCXAXGGXGA GGXXXAXGGC ACXXAXXCCX GCXAGXCCXG CAXXXCXCCC 13080 
XXCXCACAGG AGCACCXGGG GXAXGXXXXG CAGCXAAGTX GTCXACCAAX XCCCXGACCA 13140 
XXCAXXCAAA CCTXXGAXXX XXCXGXAXGX CAGXXXCXXA GXXCAAAGAX GGGAGXGXGG 13200 
AXCACXGCCA AG6XCXGXXX XXGGCXGGCA CACACAX6CA CACAAACAXG XGXGCACACA 13260 
AACAXGXGXG CCCAAACAXA CXCACACCCC XCCAAAAXGC XAGAAGGAAX CGAXXGXGCA 13320 
GAACAAXAXG XCXCAXGAGG GAGXAXGCXG AACXAAAAXA AXXXXGAXXG CXXGXCAGAA 13380 
AAXGAXXAGG CAACAGXCAX XACCAXGCCA AGACXGXCCC AGXCXCCAXX GXXCCXAACA 13440 
AGACCXGAAX XACXCAXXCC CXAAAGAGAX GGXXGGXXXA GCAGCCGAAG GAXXXXAGX6 13500 
CXAGACAGAG XCCCAGACAG CAGXGCCACA GXGATGGCGA GGGAGAGGAG XAGCAGGGGA 13560 
GCGGXGAGG6 GCACXXXCXG GAGGAGGGXA XAGGGCAAAA ACXGGGAGGA GAAGAGGGAC 13620 
AAGGXXCAAX AGCGGAGXGC AAX6GAGAG6 ACCGACACAG CCAGCCCGAX XCAGAGCCAC 13680 
AGAGXAAXGG GACCAGAXGA XCXXCACAGA CXCCCXXXCX CCCAXA6AXC XXGCACACCA 13740 
XAGXGGAGAC XXCCCAXGXA C^XCXAXGGX XXGCCACXXA CAGAGXXACX XGGAGCCAGC 13800 
XGAAGXXAGA GCXGGCXXCX CCCCXXXGAG XCXXCAAXXC XGXGXXXAXG XGCAGGCCCG 13860 
GGGACCAXGC CAGGCXXCXA AGAAGGXCXX C6AAXGAAAG XCXGCXXGGG CXCXAGXGXG 13920 
TCCAGATCTC AGTGCCACTA TTMCCACTa RTATTGATCA AGTGCTGCXC TCCAGGAAGA 13980 
CCCCTGAGGT TTCCTGGTCC AXTCCSAATGC ATGCTGGGTA CTCTTGCACT TGGATGGAAG 14040 
TAAAATCTCC TCACTAAACT CTGTGCCACC AAAATCTCCT TCXCAGTGTG AATTGAAGAA 14100 
ACATTTTCCA AGACTTGCAT GTGCCAGGAG CCAAGGACTC AGAGTGATAA AACAGCGTTC 14160 
TGCCCTCAGA GCTCTCTGTG GTGGGGCGCT dTCCTGTGCTG TCTGGCTTTA CACACAQCAG 14220 
GCAGAATGAC TTGAATTCGG CTGCTGTGCT GTCGTACAAC ATGCTGTTTA GGATCTTGCA 14280 
CATGATAGCT AGGTATTCTt GCXTCAAATC 6CAG6CACCC CACTTACCAA CTGTGTAGAC 14340 
TTGATCACGT TATTCAACCC CT6T6TCTCT GCXTCCTCAT TTTACAAATG GGGAGAAAAA 14400 
TAGCATCTAT CTCAAAGTXG XGAAAAXXAA GCAAGXXAAX ACAXAXGXGC XACGXAGAAC 14460 
AGXGCCX6GT ACAXGGXCAG XXIXXGAXAC AXGXIAGGXA XIAICAXXAX XAXCACCICC 14520 
AGAAACAAXX XAAACXXCXC AXAXAAGGCX CXCCAGACAC CXCXCATXGX CXXCCCXXCC 14580 
AAAXCXGCAX XXAXCXCXCT CXCXXXGCAG XCCAGXGTGA GGCXXGAATC ACCTAXCAAG 14640 
CCXCACCXCC ACCCCXGXGC XXXACAAAAX GICC2XAGAGC XXCXAXXXAC XCGXCTCACT 14700 
GCXCXGXGGG CIXXXXCACT CAAGGGCGXX XGCAXGCTAX CCAXXGCXAC CTGXXXXCXG 14760 
TXGCIGGXGX CXGXCXCCXG CXCXAXCITX GAAGAAAAGA AAGAAGAAAA GGAAXAACXG 14820 
AGAAACAGAG AAAAAAAAXG TCXCXCCCIX CXGGXTCXXC CAGACCACCC ACXCAXCCAT 14880 
CXXGIXCAAX GACAGCXXCX CIXCCIXXAA XXAAXCACI6 XGGXAIAXXX. AXAAAGCXXA 14940 
XAIXXAXGAA AGACCXXXXA AXXXXXXAGX XAXXAAAGCC CTXXCXClfTX GXCAGGXXGX 15000 
AACXGAGXGA GCXCXGGAGT XXGGAAAGAA 6AXCIXAGAA AXGGGCCAGA GAGCXCCXXC 15060 
XGAGAXCGAA GCACGGAGAA XXGCACCXGC XGXGCAX6GX AA6AGAGXGX GCXXGGXAGC 15120 
XCACAAGGGC AAGGXGAGAA XAGAAACXXX CATGCCXXXX T6AXGGGGGX XAXGAAAXCC 15180 
XACCAAGAAA CACCAGGXAX CAGAXGXGGG GXCCXGXXXX CCCAAAGCCA CA3UVXGCXXG 15240 
AAGGAAGAXC MGXGIGAXA AAAXAAXXAC CACAXGAACC AAXCXXGCAX GCACAGCAAX 15300 
XXXGAGAGCC GAXCCXGGGA GCXAGGXGXG XAGXGXIXAT CGXATXGtTG AGGCXCGXAA 15360 
AAAXCIXGXA XGGCXGCA6G CAAGCCAAAC CCXXOACAGG CACXGCAXCX CCGCTGACIC 15420 
XAGAA6ACCA AGCGCAAXXX CIXCCCXGXA lAXAAGGGGA AGXCXCXAXG CXXGGGGXAG 15480 
AGrGAGXGXXX AGCXCCXXCC CXXACXCXAC CXXGCXCCXA CIXXXCXCXA AGXCAACAXG 15540 
AGXCGACAGX XXAGXXCCAG GXCIGGGXAC CGAAGXGGAff GGGGCIXCAG CICIGGCXCX 15600 
GCXGGGAXCA XCAACXACCA GC6CAGGACC ACCAGCAGCX CCACACGCCG CAGXGGAGGA 15660 
GGXGGXGGGA GAXXIXCAAG CXGXGGXGGX GGXGGXGGXA GCXXXGGXGC XGGXGGXGGA 15720 
XXXGGAAGXC GGAGXCXXGX XAACCXXGGX GGCAGXAAAA GCAXCXCCAX AAGXGXGGCX 15780 
AGAGGAGGXG GACGXGGIAG XGGCXIXGGX GGXGGXXAXG GXGGXC5GXGG CXXIGGXGGX 15840 
GGXGGCTXXG GXGGXG^XGG CXXXGGIGGA GGXGGCAXXG GGGGXGGXGG CXXXGGXGGX 15900 
XXIGGCAGXG GXGGXGGI6G XIIXGGXGGA GGXGGCtXXG GGGGXGGXGG AXAXGGGGGX 15960 
GGIXAXGGXC CXGXCXGCCC ICCXGGXGGC AXAGAAGAAG XCACXAXCAA CCAGAGCCXX 16020 
CXXCAGCCCC XCAAXGXGGA GAXXGACCCI GAGAXCCAAA AGGTGAAGXC XCGAGAAAGG 16080 
GAGCAAAXCA AGXCACXGAA CA^CCAAXXX GCCXCCXICA XXGACAAGGX GAGXXXCXCX 16140 
CXCAXIGCaiC XGGXAGGGCX GCCGCXGGXC CACXTGGGAX XGGXGCAGXC AAAACACAXG 16200 
XAGGXXXGAA GCXCAAGXXX CCAXGXXXAC AXGAXXAAAA GGAXGXXXXG XGGAAXGGXC 16260 
TCCTAGGAGA TATGTTAGAT GTATGCTTGT GAATGGTGTT AATGACTCTC TCTTTGACAA 16320 
AGGGTTCGTG GTCGACCTAA AGGTGGGTCA GTGTGACATT AACATTTAAG TGCTTTTTAT 16380 
TCAGCTCTTG AGCGGAATTG GGACXCATAT CTGTTGAATG AA6ATAATAG AAATGGGGCT 16440 
AACTGAACTT TCCAGGGTGC AAGTGAGAAC CCTGGAAAGG TCTTCCTAAC CATAGAAAGG 16500 
GAGTTGAGTG TGAACATAGT ATAGAGTGTT ATTGTAGCAG AAAACATGTG GTCAGTCAGT 16560 
GCCAAACATC TTTTGCTGTC AGAGGGGAGC TCTGCCTTCT AATAAT*TTA CATTGGTACT 16620 
GGATGAGGCT AGAGTTTTTT TATACTAATA TCTCCAAAAA TCAGCTCTAA AAAACTCAGA 16680 
TAAACCATTT TTTTAATTTT TTGCTTAATC ATTAATAGTG CCAATCCAAG GTTATCCACA 16740 
ACAAATTTCA AATCCAATTT TGAATTTTCC TGATATACTT TTGAAATGTG TGTGTGTCCT 16800 
GGGGATGCAA ACCAGTTTTT ATGGTAATAT ACCTAACAAA ATTTTGGAAG GCAAATCTCT 16860 
TAAATACCAT GCACCTATTtT CAAAACATAA TTGCAATAAT TCTGTATGCG CTTTGCTATT 16920 
GGTATTTGTT TAGTTACTCC CXTCCAAGCC CTCTCTGAAT TAACAAGTTG GGTTTTATTA 16980 
TGCAGATGAT ATTAACXTGA TCATCTTCTT CCTATTTCTC TGTCATGGTC AGAAGATAGG 17040 
AATTGAGGTT CTTTTCCAAA TGAGGCACAG TTCTCCATGG CTATGAGACT CCATTTATGC 17100 
ATCAGGAGTA AAGGGGTCTT GXGTTTTTAG GTGAGGTTCC TGGAGCAGCA GAACCAGGTA 17160 
CTGCAAACAA AATGGGAGCT GCTGCAGCAG GTAGATACCT CCACTAGAAC CCATAATTTA 17220 
GAGCCCTACT TTGAGTCATT CATCAACAAT CTCCGAAGGA GAGTGGACCA ACTGAAGAGT 17280 
GATCAATCTC GGTTGGATTC GGAACT6AAG AACATGCAGG ACATGQTGGA GGATTACCGG 17340 
AACAAGTAAG GGACCCTGTC TGGGCAGTTC TTAACTTTTG CTGTAAAAGA GTTCCAGAAA 17400 
GTAATAAGCT AAGATCATGA AGCAGCATGT AGCTATGTCT TTTCTTAGGT TAGAGGCACA 17460 
TCAGTTTGAC ATTTTCAGAA ATCTTCATTT TCTCAGGAGA TGGAAATAGT CTAGTGGTTT 17520 
TATTGCTCAG TAGAAAGTAG TGGCCAATAT GTCCTAGGTT CATAATAGAA AGGCAGTGAT 17580 
AGGCAATGCC ACCTTTAGTT TAGAATGCTG GACTTCAGGT CTTACCACCT CTGAATCTCC 17640 
TAATTGTTTC TGCTTTCCTG CAGGTATGAG GATGAAATCA ACAAGCGGAC AAATGCAGAG 17700 
AATGAATTTG TGACCATCAA GAAGGTAAGC AAATTCTGTA GGACGGAACT CACATTTGAA 17760 
ATAAATAAGG GAAGAGGGTC TCGAATTACT AAGCAGAAAG CAGCCATGAX ATGGAGAGCC 17820 
AGGTAGTAGA CCTGGGGAGT ATATGGAGTG GGGCTATATT TTTCACATCA TCATGGACGT 17880 
GGACTGATCC AGGCACTTGG CTTCTCCATA TTTCCCAGCA CCTTACATAG TAAGXGGAGT 17940 
GGCAGAXXCX CAGCAAGCCA GGCACACXCC CXXGAXGGXG CXAXCCGGGG GXGGGACAGX 18000 
XAGGGAACXG XGAXXXACCX GGGGCAAAAA GGAGXGGAGT AGACCCAAA6 CXCCXXXXXX 18060 
TGCXXGGAGA AXCCCCXCAC AGGXAAXGAG AGGGACCXGC CCXGGAGAGA ACGXGCCXXC 18120 
AXGAXGXCCC XXGXXCCXCX AGGAXGXGGA XGGXGCXXAT AXGACCAAGG XGGACCXXCA 18180 
GGCCAAACXX GACAACXXGC AGCAGGAAAX XGAXXXCCXX ACAGCACXCX ACCAAGCAGX 18240 
AAGXCXXCCA GXTTTCAACCA AGXXXAXCXA AAXGGAGAGX XXXXAAGCCG GAACCCACAA 18300 
CGAXXCAGAA GAAXAGAXAX XXAXCXXXXA XXXCCXGACX GCIXXCXCXG XCXAAGXXGX 18360 
XXXXXGXXXX AGXGCXGXAA GAGXCACXAA CCXAXXAXGX CXXGCAGGAG XXGXCTCAGA 18420 
XGCAGACXCA AAXCAGXGAA ACXAAXGXCA XCCXCXCXAX GGACAACAAC CGCAGXCXCG 18480 
ACCXGGACAG CAXCAXXGCX GAGGXCAAGG CCCAGXACGA GGAXAXAGCC CAGAAGAGCA 18540 
AAGCXGAGGC CGAGXCCXXG XACCAGAGCA AGGXGAGXGG GCXGAAACCG XAGCCAGXXX 18600 
CCCTGAftATG GCTTGTCTTG CTATCCXGTG TTATCtCATG TAlGTGTGCC TGTGCCATGC 18660 
T6AGTTCTGC CTACATTTAA CAAACGCTAT CTACCATCTT TAGTATGAAG AGCTGCAGAT 18720 
CACTGCTCGC AGACATGGGG ATAGTGTGAG AAATTCAAAG ATAGAAATTT CTGAGCTGAA 18780 
TCGTGTGATC GAGAGACXTA GATCTGAAAT cdACAATGTC AAGAAGCAGG TATGTGCTTT 18840 
CTCCTTCTAC CACTCAGCTG TATGGAATGG GGGTAACOCT CAGGTAAAGG GCGAGTGCTT 18900 
TCCTAGTTTT 6AATCXTGCA ATTCAGCCCai AGGCTACATT ATTAGCCCTG GTTCCTTTTC 18960 
TGACTATGCT AGTTTCCA6A ATGCAGCCAT CATGCTGGGT TCTCTTTAGG GAAATCTGTG 19020 
AGAATGGCCT AGTAGAGAAA GATGGGATGG TCAAT6T6AG TGATCTAGCC TATGACCCAA 19080 
AGTGGACTTA AGAGTXGGGG AGTGAGAGGA AGGGCAGCCA GGAGG-TTTTA GAGTAGGTGT 19140 
TTAGAAGAAT GTCAAGTCTG TAAGGGTXGT AGGAGCCTXG ACTCAGGGCC AAGA6AGGCT 19200 
GTTGAGTTAT CCCTAAGGTC TTTTAAGGAA GTCAACATGG TGATGTGTTA TCTGGAGGTG 19260 
GGTGTGAGAT GACTTAAGGC CAA6TGGXTC TGTTGGACTC ATTATTGGCC TCACTGGAGT 19320 
GGGGAGACCA ATTGGGAXGA GGAGGCCTAG TGGGGAATGC ATATTATGAG AGGGTGTCAT 19380 
ATCTTTTTCA 6ATCTCCAAC TTGCAGCAGT CCATCAGTGA TGCAGAGCAG CGTGGC6AGA 19440 
ATGCCCTCAA GGATGCCAAG AACAAGCTGA ATGACCTGGA GGATGCCCTG CAGCAGGCCA 19500 
AGGAAGACCT GGCCCGCCTG CXGCGTGACT ACCAGGAGCT GATGAACACC AAGCTGGCCC 19560 
TGGATCT6GA GATTCCCACC TACAGGACCC TCCTGGAGGG AGAAGAAAGC AGGTGAGGAA 19620 
GGGACGCTGG GAGT06AACC TCTTCTCATG GTCTTCCXTC CTTGCAAGCX GATTGTTGTT 19680 
GAAGATGCAG CCATCTGATT GCAGCTTGTG CTGGGTATCG GGAAATGAAA AGTACACGGA 19740 
GCA6GAGGAA GGACCTAGTT TTACTTTGGG AGCTGGAGTC CCAAGCTGTT TATTTTTTTC 19800 
TTCTAGGGCX GXAACATATC TAGAAAGA6C TTTGAG6TGG AGCAAATTAT TCTTTATCTG 19860 
GGCTGCCTCA GATGGCAGCT GGCCTAAAGT CGGCATCTTO! AGAGGGGGCC TtCATTGGCT 19920 
GCAAGGCTCG TCTCGTTTAT ATGGGAATTT CTCCGT6TTT 6TACTCTTGC TGAGAAAAAA 19980 
TGACAGGTCT GGGAGGCCAG AGGGGATXGG ATTAAGTTTC AGATTAAGTG CATTGGAQAA 20040 
GACCCAGATG GGGAAAGTCT TCAAGGT6GT GGAGCGGGGA AtGGGGAAGC GGTXTGGGAA 20100 
GCTGGAGTGT CCTGAGGAAT TTTCTTATTT TCTCCTACAG GATGTCTGGA GAATGTGCCC 20160 
CGAACGTGAG TGTGTGTAAG TACAAGTCX^A TMCTCAGGG GCATGTGCAG GCTTTGTTGG 20220 
GCTGGAAACG GAGTXGAGGl! TGAAAATAAC TGAGCTTCCX CTTGCAGCTG TGAGCACAAG 20280 
CCACACGACC ATCAGXGGAG GXGGCAQCCG AGGAGGXGGC GGCGGXGGCT ACGGCXCXGG 20340 
AGGXAGCAGC XAXGGCXCOG GAGGXGGXAG CXATGGXXCX GGAGGXGGCG GCGGCXM3CGG 20400 
CCGXGGCAGC XAXGGCXCCG GAGGXGGCAG CXATGGCXQX GGAGGXGGCG GCGGCGGCCA 20460 
XGGCAGCXAC GGCXCCGGAA GCAGCAGXGG GGGCXACAGA GGXGGCXCXG GAGGCGGCGG 20520 
CGGCGGGAGC XCXGGCGGCC GGGGCXCXGG C6GC6GGAGC TCXGGAGGCT CCATAGGAGG 20580 
CCGGGGAXCC AGCXCIGGGG GXGXCAAGXC CXCXGGXGGC AGXXCCAGCG XGAAGXXXGX 20640 
XXCXACCACX XAXXCCGGAG XAACCAGAXA AAGAGAXGCC CXCIGXXXCA XXAGCXCXAG 20700 
XXCXCCCCCA GCATCACXAA CAAATAXGCX TGGGAAGACC GAGGXCGAXT XGXCCCAGCC 20760 
XXACCGGAGA AAAGAGCXAX GGXXAGXIAC ACXAGCICAX CCIAXXCCCC CA6CXCXXXC 20820 
XXXXCXGCXG XXXCCCAAXG AAGXXXXCAG AXCAC^XGOCA AXCXCAGXCC CCXGGCXAXG 20880 
ACCCXGCIXX 6XXCXXXCCC XGAGAAACAG XXCAGCAGXG ACCACCACCC ACAX6ACAXX 20940 
TCAAAGCACC TCCTTAA6CC AGCCA6AGTA GGACCAGTTA GACCCA6GGT GTGGACAGCT 21000 
CCTTAGCATC TTATCTCTGT 6CTGTTTT6G TTTTGTACAT AAGGTGTAAG CAAGTTGTTT 21060 
TTCTTTTGTG GAGAGGTCTT AAACTCCCCA TTTCCTTGTT TTGCTGCAAT AAACTGCATT 21120 
TGAAATTCTC CATGTCTCX3A TCGCCCTTGT TTACGGCACT GTCTAACCTG 6ATGGGTGTT 21180 
TTGTGAGGTA AAAGAAGACA CTAGAGCCAC ATGGCATATG GGAAAGTCAT GCACACAAAC 21240 
ATGAGAAAAA TGCAGAGGCC AACCAGGCAA CATTTCACCA GACTGGAATC ACAGAGAGAG 21300 
GAAGCACTTT CCCAGATGGT GGGGATGTCA TGGAGAAATG GAGAGACCGG GTGACAGGTT 21360 
TTGTTCATTT GAGAAGGCTT TCTTGAAAAG G6CAGT6AGC AAGCAGGTTG GGAG6AAGAG 21420 
GTGTGGCATT GAGAAGAAGG GAAAGTATTG CATGAAAAAG TAATTCTTCA CGTGGAACAG 21480 
CCAGTAAGGA GGGGCATGAG TAATATAGGG TCAGCAGTTA CTGGAGCCAG AATACAGACT 21540 
TTGGCCTGGG GAGTTCAAGA ACTAAGAGTG GTAATAGAGA GTTGGATATT CCATTTCCCT 21600 
TCTCTTTTTG TGCCACCACC CAAAGCTCTG CATAATCTAA GAAGTTCCCT TGTTGACACA 21660 
TAGCTCATAC TTGTGAAGTT GTACAACAGG ATAGCATAGT GGCCAGAAGC ATGGACAGTT 21720 
GAACTCAGAT ATGCTTGGGT TTGAATCTTA CCATCACCAT TTACTAGTTC TGTAATACAG 21780 
TGCAAGTTAC AGACATCTCT GCACCTCAGT TTTAGTATGT CTAAATXGGG GATGATAAXG 21840 
CCTTCCTTGT GGGGATAGTG TGAGGATTGA ATAAGATGAA TACACATGGC TGAGCACACA 21900 
GCAAGCACTA AATAAGTGCC AGTTTTAATG ATAACGGTGA TGATGATGAT GATGATGATG 21960 
ATGACGTAAC ATTGCTTGTG GGACTCCATA CAGCTCAGTA GATGCTTGCT CAAAGAAGCA 22020 
AGTTACCAAA ATTTTTGTAA TGGTTCTATG AACGTGAAAA AAGCAGTCAA CTTCTCTGAG 22080 
GATCAATTTC CTTAGTTTCC AATTAGGAAA AGTCTTCTTA GCTCCAGAGT CCCACAGGGC 22140 
TAATGGAATA AGGAGAGGAT AGATCACACA TGTATTATGC AAACACAACT CAGGTGAGCT 22200 
CTATTCTTCC TTCTCAGTTA TCCCTTCTGT AGGGACCCCA GTGTCCCCTG CTGTCTTTCT 22260 
GTGTCCTGAC CGGGAAACAC AGTGTGCCTT GTCTACTCCA TCACTTGGCC AGCTGCATGC 22320 
TTTCCTTTGC AGGCTTGAAG CAAAGCTGGG TCTCGGACAT TCTCAGGCAC TGACAAAGCT 22380 
GTTTAGTTGT TGCTGGGAAA CACTGGGAAA TAGCCCTTTT GTTAAACACA CAGAAACTAG 22440 
CCTTCGCCCT GAGCCAAATT CCTTAAACTC GTCTATGAAA TTCCATAACC TGACTCCTTA 22500 
ACTGCAGACA TACCCAGCTA GAACATCCCT CATGTCCCTG TCCACCGTGA GAATGCTGCA 22560 
CTTCACTCTG AACCTTTAGT CCTCCTTTTA AATACTGCAC ACTGATCACC CTGGTGTTTA 22620 
GTGCTTTGTT TTTTGGAATC CCACCTGGCT CCATTTTGGG ATGGTTCCGG GCACTTCCCT 22680 
ATGGAAATTC CCCTGCTGTC ACTGTCAGAG TGAGTCCAGC AGTGGGTTTA GCTGGATGAA 22740 
ACACCACCAT GTCCATTTCC ATTCAGACTA ATGTCAGAAT TTGAAAGGCA CTATGGTAGA 22800 
GTAGAAAGAA CAAGGAACTG TACTATTTAA AGGGCAGGCA AA6AAAAGGC ATCTATAGCT 22860 
TATAAGATGT GTGGAtCTTT GGATGTGACT TGGCCATCCT GAGCCTAAGT TGTCTTGtAG 22920 
GAGAAATGGG AATGAGAATA TTTTCCTCTA GACATCAAGA GGAAAAGAAA TATAACGTGA 22980 
AAACCTTTGT GAATTGTGAA TGTGTTATAC AGAGTAGCTA AAAGAATXAA AAAGGGAGTG 23040 
ACAAAAAAGT AAAAGGCAGC TGGCTGCTCA GG6CCTCCAT GGAGGGAAGT ACCTTGATAT 23100 
GGTCACTGTG GCTCAGTGAC AGCTCTGCAG GGACAGGAAA TTGATTTGTT AGTGCACCCA 23160 
AAGOTGAATC TGCTCCTGAG TACTGATTTA TGGGAACCAA ACACACAAGA GATGAAGGAT 23220 
GTGTCAACCA GAATGTCCAG CATTAGCTTG TGGGGAAACA CATACTTCCA GTGACTGAAA 23280 
TACCATCCTG TTATCAAGAG ATCTGGGAAA CTAftftGTACT GACAAGAGCT GGCTTGATCT 23340 
GTGGATOrXAG AACAATGAGA GTTAGGTGGC CTTGAGGGAG AT6ATTCACT CTCCTTCACA 23400 
GAAGAGCTGA CCTCTGGGGT CAACAGATAT AGCACCTCTT TCCCAGGGAC GCTACTGAAT 23460 
GAACAGTGAT GTGTTCTTAT ACTCTG6CCC AGATTTTCTA CATACTTTCT TAGGTTACAA 23520 
CTTTATTTAG TCACATTTCA GTACTGGGGA TACTCCXGTT TATCTTCTTT GGACTCGAGT 23580 
TTTTATGGGA AGGTCATGAA ACAGAGAAAA ATACAATTTG CAGGGAAACT TACCAAGGCT 23640 
TGTAAGGTTA CAAGGATTAA ATGAAAACCC TGTGTAAGTC AGTATATAGT GAAGAAGTAA 23700 
ATTGAGCTAG ACCAAACGCC AAAATGCATC CGCATTAGAA AGACGATAAA GGAA6ACTCT 23760 
GGATTCAGTT CTGTTCAAAA AACATTTTCT GCACAAATAC TATCTATGAG GAACTGGGCG 23820 
TTGGGGAGAT GATGATGAGT GAGACATGGt XCTTGCMTC AGAGAGCCTA GAGACCTGGG 23880 
TGGTAGCAAT GGTAGAGATA CATCCAAGAC ACAGAAATAG ATATACAGGA ACACAGATGA 23940 
TTGAAA6TGA TGCTTGGCAG GGCTTTAAAG AATGAATCAG AGTTTTTCAG GCAGACGAGG 24000 
ATCTTCAAGG CAGAGGGAAT CATATAGATA AGGACATAGA AGAGTGAAAT TTCATGAAGT 24060 
AGXTAAGCAT CTGAAGAAGC ATGGAATTAG TGACAAGAAA TGAtGCGGAA AAGATATCCA 24120 
GATCCAATCA AGAAGGGCCT TGirGGCATT CTATGGAGTC TGGAGTTTGG CTTCTGGGTC 24180 
ACAAGTTCTC AGATGGGGTT TTCATATCTA TTATTAGACC TACTATGTAC TGGTCCAGTG 24240 
GAAGGGAAAG GGGTTGTCOT ACTGCtAGTG GAGTAGGAAT TGGGTATGGA CCACAGCTTG 24300 
TCTTGTTTCC AAGTATTCCC TAAGAAATC5T GGTCTGCTGA TGGGAGAXCT AXTTATGGAA 24360 
ATGTCTTTTT CCCTCAGGAA TTTTATGTCG GAAACAGCTG TCATAGGTGA GGAGGAACTG 24420 
GTAAAAGTAC TTAATAGGAG AGTGTCATGG TCAGATTGGT GTTTTGGAAA AGTCAGCCAG 24480 
GGCAGATTGG AGAGGTCCAT ATT6GAGGCA GGAAGACTTA AGAGACTATT GCAAAGGTGA 24540 
AGACAAAAGA CGATAGGGAC OTGCACTTTA ATTCCAGCCC TTAGAAGTAG TAGAAGGTCA 24600 
GAAATGAGAA TATGCATTAC AGAGATAGTT AGTTGCTATA TCATTAGGAC TTGGTGATAG 24660 
AMGGATGAG GAT6CG6XTG GGTGAGGCAA AGAGGAGAGX CCACATTCCT GGTCTGGGXA 24720 
GTAACAAAGA ATClTAGCAAG AGGGCTTGTO GGGAAAGATG CTGAGTTACG TAGCAAGTGC 24780 
ATCTGCTTTA TCCTTGTAAX GAAXGGGGCX AAAGGXGXAA ACXIRAAGAGX CAXCAGCAXX 24840 
X6GAGGGXAG AAXAAAXCAX CAGAXAACXC AGGAAGAAGG AGCAGAAGAA XXACXGAXAC 24900 
TCCCXGGAAG GAAAACCGGA AGXAAAXGGG AGAAACXXGC XCAAGXGGAC AAAGXXXAAC 24960 
AGACAX6AAG CAXGAAXXC 24979 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Ser Arg Lys Ser Tyr Lys His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Met Ser Ser Val Lys Phe Val Ser Thr Thr Tyr Ser Gly Val Thr Arg 
1 5 10 15 

CLAIMS 

What we claim is: 

1. A keratin K1 vector for expression of a nucleic acid cassette in the 
epidermis comprising: 

a 5' flanking region of the keratin K1 gene, said 5' flanking 
sequence including a keratin K1 promoter, a 5' transcribed but 
untranslated region and a first intron all in sequential and 
positional relationship for expression of the nucleic acid cassette; 

a 3' flanking region of the keratin K1 gene containing 
Vitamin D3 regulatory sequences, said 3' flanking region including 
a 3' transcribed but untranslated region and contiguous non-coding 
DNA containing the transcriptional termination region; and 

a polylinker having a plurality of restriction endonuclease 
sites, said polylinker connecting the 5' flanking region to the 3' 
flanking region and said polylinker further providing a position for 
insertion of the nucleic acid cassette. 

2. The keratin K1 vector of claim 1, wherein the 5' flanking region 
is approximately 1.2 kb and the 3' flanking region is approximately 
3.9 kb. 

3. The vector of claim 1, wherein the nucleic acid cassette includes a 
nucleic acid sequence coding for a protein, polypeptide or antisense 
RNA. 

4. The vector of claim 1, wherein the cassette includes a nucleic acid 
sequence coding for an oncogene. 

5. The vector of claim 4, wherein the oncogene is selected from the 
group consisting of ras, fos, myc, erb, src, sis and jun. 

6. The vector of claim 1, wherein the cassette includes a nucleic acid 
sequence coding for a transforming gene. 

7. The vector of claim 1, wherein the restriction endonucleases are 
selected from the group consisting of Bam HI, Xma I, Kpn I, Not I, 
Cla I and Bgl II. 

8. The vector of claim 1, wherein the nucleic acid cassette contains the 
E6 or E7 transforming sequence of human papilloma virus. 

9. The vector of claim 1, wherein the nucleic acid cassette contains the 
TGF-α sequence. 

10. The vector of claim 1, further comprising an additional 5' flanking 
sequence from the 18 kb Eco RV fragment on the end of the vector. 

11. A bioreactor comprising transformed epidermal cells including the 
vector of claim 3. 

12. The bioreactor according to claim 11, wherein the vector includes a 
cassette having a nucleic acid sequence coding for a protein or 
polypeptide selected from the group consisting of a hormone, a 
growth factor, an enzyme, a drug, a tumor suppressor, a receptor, 
an apolipoprotein, a clotting factor, a tumor antigen, a viral 
antigen, an insect antigen, a bacterial antigen and a parasitic 
antigen. 

13. The bioreactor of claim 12, wherein the nucleic acid sequence 
encodes proinsulin or insulin. 

14. The bioreactor of claim 12, wherein the nucleic acid sequence 
encodes growth hormone. 

15. The bioreactor of claim 12, wherein the nucleic acid sequence 
encodes insulin-like growth factor I, insulin-like growth factor II or 
insulin growth factor binding protein. 

16. The bioreactor of claim 12, wherein the nucleic acid sequence 
encodes a clotting factor. 

17. The bioreactor of claim 12, wherein the nucleic acid sequence 
encodes an epidermal growth factor (TGF-α), a dermal growth 
factor (PDGF) or an angiogenesis factor. 

18. The bioreactor of claim 12, wherein the nucleic acid sequence 
encodes a Type IV collagen, laminin, nidogen, or Type VII collagen. 

19. The bioreactor of claim 12 for vaccine production, wherein the 
cassette includes a protein which induces an immunological 
response. 

20. A method for ex vivo introduction of a keratin K1 vector into 
epidermal cells comprising the steps of cotransfecting the vector 
with a selectable marker and selecting the transformed cells. 

21. A noncoding fragment of a human keratin K1 gene containing 
regulatory sequences of SEQ. ID. No. 1. 

22. A human keratin K1 vector having: 

a 5' flanking region comprising nucleotides 1 to 1246 of 
SEQ. ID. No. 1; 

a 3' flanking region comprising nucleotides 6891 to 10747 of 
SEQ. ID. No. 1; and 

a linker comprising nucleotides 2351 to 2376 of SEQ. ID. No. 2. 

23. A method of making transgenic animals comprising the steps of: 

collecting very early fertilized eggs; 

inserting the vector of claim 3 into said fertilized eggs by 
micro-injecting the vector into pronuclei; and 

transferring said injected eggs into pseudopregnant recipient 
females. 

24. A transgenic animal containing the vector of claim 3 in its germ 
and somatic cells, wherein said vector was introduced into said 
animal or an ancestor of said animal at an embryonic stage and the 
nucleic acid cassette of said vector is only expressed in the 
epidermis. 

25. The transgenic animal of claim 24, wherein the nucleic acid cassette 
contains a transforming gene sequence. 
26. The transgenic animal of claim 25, wherein the oncogene is selected 
from the group in table 1. 

27. The transgenic animal of claim 24, wherein the animal is a rodent. 

28. A method of studying the origin of or treatment for cancer 
comprising the steps of: 

making a transgenic animal by injecting an embryo with a 
human keratin K1 vector containing an oncogene in the nucleic acid 
cassette; 

using the resultant animal or the progeny in studies of 
cancer. 

29. The method of claim 28, wherein the nucleic acid cassette contains 
the fos oncogene sequence. 

30. The method of claim 28, wherein the nucleic acid cassette contains 
the ras oncogene sequence. 

31. The method of claim 28, wherein the test animal contains more 
than one oncogene. 

32. A method for in vivo transduction of epidermal cells with a keratin 
K1 vector comprising the step of contacting the vector with 
epidermal cells for sufficient time to transform the epidermal cells. 

33. A method for transient introduction of a nucleic acid cassette into 
a human, comprising the step of contacting a keratin K1 vector 
containing said cassette with an epidermal cell for sufficient time 
to transduce the cells with the vector. 

34. A method of enhanced healing of a wound or surgical incision 
comprising the step of in vivo transduction of epidermal cells with 
a keratin K1 vector, wherein said vector includes a nucleic acid 
cassette having a nucleic acid sequence for a growth factor. 

35. The method of claim 34, wherein the epidermal cells are transduced 
with a plurality of vectors and wherein the cassette of at least one 
vector includes the nucleic acid sequence of epidermal growth factor 
(TGF-α), the cassette of at least one vector includes dermal growth 
factor (PDGF), the cassette of at least one vector includes the 
nucleic acid sequence for a matrix protein to anchor the epidermis 
to the dermis and the cassette of at least one vector includes the 
nucleic acid sequence for an angiogenesis factor. 

36. The method of claim 35, wherein the sequence for the matrix 
protein is selected from sequences coding for a protein selected 
from the group consisting of Type IV collagen, laminin, nidogen, 
and Type VII collagen. 

37. The method of claim 35, wherein the angiogenesis factor is selected 
from the group consisting of acid fibroblast growth factor, basic 
fibroblast growth factor and angiogenin. 

38. A method of treating skin ulcers comprising the steps of in vivo 
transduction of epidermal cells with a keratin K1 vector, wherein 
said vectors include a nucleic acid cassette having a nucleic acid 
sequence for a growth factor. 

39. The method of claim 38, wherein the epidermal cells are transduced 
with a plurality of vectors and wherein the cassette of at least one 
vector includes the nucleic acid sequence of epidermal growth factor 
(TGF-α), the cassette of at least one vector includes dermal growth 
factor (PDGF), the cassette of at least one vector includes the 
nucleic acid sequence for a matrix protein to anchor the epidermis 
to the dermis and the cassette of at least one vector includes the 
nucleic acid sequence for an angiogenesis factor. 

40. The method of claim 39, wherein the sequence for the matrix 
protein is selected from sequences coding for a protein selected 
from the group consisting of Type IV collagen, laminin, nidogen, 
and Type VII collagen. 

41. The method of claim 39, wherein the angiogenesis factor is selected 
from the group consisting of acid fibroblast growth factor, basic 
fibroblast growth factor and angiogenin. 

42. A method of enhanced healing of a wound, surgical incision or skin 
ulcers in humans and animals, comprising the steps of: 

ex vivo transduction of epidermal cells with a keratin K1 
vector, wherein said vector includes a nucleic acid cassette having 
a nucleic acid sequence for a growth factor, and 

transplanting said transduced epidermal cells into the animal 
or human to be treated. 

43. The method of claim 42, wherein the epidermal cells are transduced 
with a plurality of vectors and wherein the cassette of at least one 
vector includes the nucleic acid sequence of epidermal growth factor 
(TGF-α), the cassette of at least one vector includes dermal growth 
factor (PDGF), the cassette of at least one vector includes the 
nucleic acid sequence for a matrix protein to anchor the epidermis 
to the dermis and the cassette of at least one vector includes the 
nucleic acid sequence for an angiogenesis factor. 

44. The method of claim 43, wherein the sequence for the matrix 
protein is selected from sequences coding for a protein selected 
from the group consisting of Type IV collagen, laminin, nidogen, 
and Type VII collagen. 

45. The method of claim 43, wherein the angiogenesis factor is selected 
from the group consisting of acid fibroblast growth factor, basic 
fibroblast growth factor and angiogenin. 

46. A method for treating psoriasis comprising the step of in vivo 
transduction of epidermal cells with a keratin K1 vector, wherein 
said vector includes a nucleic acid cassette having a nucleic acid 
sequence coding for a protein or polypeptide selected from the 
group consisting of TGF-β, a soluble form of cytokine receptor, and 
an antisense RNA. 

47. The method of claim 46, wherein the cassette contains the sequence 
for TGF-β. 

48. The method of claim 46, wherein the cassette contains a soluble 
form of cytokine receptor selected from the group consisting of IL-1, 
IL-6 and IL-8. 

49. The method of claim 46, wherein the cassette contains an antisense 
RNA to a sequence selected from the group consisting of TGF-α, 
IL-1, IL-6 and IL-8. 

50. A method for treating skin cancer comprising the step of in vivo 
transduction of epidermal cells with a keratin K1 vector, wherein 
said vector includes a nucleic acid cassette having a nucleic acid 
sequence coding for an antisense RNA for the E6 or E7 gene of 
human papilloma virus. 

51. A method for treating skin cancer comprising the step of in vivo 
transduction of epidermal cells with a keratin K1 vector, wherein 
said vector includes a nucleic acid cassette having a nucleic acid 
sequence coding for the normal p53 protein. 

52. A method for vaccination comprising the step of the in vivo 
transduction of epidermal cells with a keratin K1 vector, wherein 
said vector includes a nucleic acid cassette having a nucleic acid 
sequence coding for a protein or polypeptide which induces an 
immunological response. 

53. The method of claim 52, wherein the cassette includes a sequence 
for a viral capsid protein. 

54. The method of claim 53, wherein the capsid protein is from the 
human papilloma virus. 
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING 
This ISA found multiple inventions as follows: 

I. Claims 1-22 and 42-45, drawn to a keratin K1 vector, transformed epidermal cells containing the same and a method 
of wound healing which comprises the ex vivo transduction of epidermal cells with a keratin K1 vector, classified in 
Classes 435 and 424, subclasses 320.1 and 93B, respectively, for example. 

II. Claims 23-31, drawn to a transgenic animal containing a keratin K1 vector and a method of using the same, 
classified in Classes 800 and 424, subclasses 2 and 9, respectively, for example. 

III. Claims 32-41 and 46-54, drawn to methods of wound healing, treating skin ulcers, treating psoriasis, treating skin 
cancer or vaccination, all of which comprise the in vivo transduction of epidermal cells with a keratin K1 vector, 
classified in Class 514, subclass 44, for example. 